Re: Review Request 66190: Fix 'PreemptorSlotSearchBenchmark', remove 'isProduction' references in benchmark

2018-03-20 Thread David McLaughlin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66190/#review199636
---


Ship it!




Ship It!

- David McLaughlin


On March 21, 2018, 12:26 a.m., Jordan Ly wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66190/
> ---
> 
> (Updated March 21, 2018, 12:26 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, and 
> Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This benchmark was using the deprecated `production` flag when building the 
> tasks for the cluster. `PendingTaskProcessor` depends on `tier` instead, so 
> this benchmark ended up not testing the correct codepath.
> 
> Removed references to `production` and added `tier` instead. Additionally, 
> removed some unused options.
> 
> 
> Diffs
> -
> 
>   src/jmh/java/org/apache/aurora/benchmark/BenchmarkSettings.java 
> ddab2eccb22a93ecb67481f399707d2d82df5db2 
>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
> 1f9a5764b502f939f0345ff99fb0fc2830b4c2f0 
>   src/jmh/java/org/apache/aurora/benchmark/Tasks.java 
> 60c62bbf3061650a5dd8654045dc8189293d0190 
> 
> 
> Diff: https://reviews.apache.org/r/66190/diff/2/
> 
> 
> Testing
> ---
> 
> Old:
> ```
> # Run complete. Total time: 00:08:32
> 
> Benchmark 
>(numPendingTasks)   Mode  Cnt Score Error  
>  Units
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark
>1  thrpt   1057.670 ±  20.451  
>  ops/s
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate 
>1  thrpt   10   595.374 ± 211.805  
> MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate.norm
>1  thrpt   10  10830342.916 ± 380.919
> B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space
>1  thrpt   10   593.530 ± 222.002  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space.norm
>   1  thrpt   10  10717947.102 ± 1280229.296B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space
>1  thrpt   10 0.305 ±   1.264  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space.norm
>   1  thrpt   10 13552.434 ±   61403.918B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.count  
>1  thrpt   1060.000
> counts
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.time   
>1  thrpt   10   202.000
> ms
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·stack 
>1  thrptNaN
>---
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark
>   10  thrpt   1052.161 ±   8.526  
>  ops/s
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate 
>   10  thrpt   10   550.771 ±  89.939  
> MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate.norm
>   10  thrpt   10  11074211.352 ± 318.376
> B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space
>   10  thrpt   10   550.125 ± 107.470  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space.norm
>  10  thrpt   10  11073792.311 ± 1621636.993B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space
>   10  thrpt   10 0.038 ±   0.049  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space.norm
>  10  thrpt   10   737.753 ± 919.460B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.count  
>   10  thrpt   1055.000
> counts
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.time   
>   10  thrpt   10   155.000
> ms
> SchedulingBenchmarks.P

Re: Review Request 66192: [WIP] Variable group size updates

2018-03-20 Thread David McLaughlin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66192/#review199635
---




src/main/java/org/apache/aurora/scheduler/updater/strategy/VariableBatchStrategy.java
Lines 107-111 (patched)


Is this idea of a step sate needed? Strategies writing to the database 
seems like a design smell. 

If I have 10 instances:

[0,1,2,3,4,5,6,7,8,9] 

And I pass a VariableBatchStrategy of: [1, 4, 5]

Then using the BatchStrategy it is *always* in the following order:

[[0], [1,2,3,4], [5,6,7,8,9]]

And I don't *think* I need to write the step to the database to figure out 
which step I'm in? I can use the number idle and active to figure out exactly 
how many have been processed.


- David McLaughlin


On March 21, 2018, 2:10 a.m., Renan DelValle wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66192/
> ---
> 
> (Updated March 21, 2018, 2:10 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, and 
> Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Adding support for variable group sizes when executing an update.
> 
> Design doc for this change is here: 
> https://docs.google.com/document/d/1xGk4ueH8YlmJCk6hQJh85u4to4M1VQD0l630IOchvgY/edit#heading=h.lg3hty82f5cz
> 
> I opted for the path of least resistance with regards to the Thrift changes 
> as I didn't see any benefit in making the larger changes required to make the 
> interfaces a bit more flexible.
> 
> Requesting feedback on these changes and the approach from the community 
> before I proceed.
> 
> Tests will be added after the community approves of the direciton and 
> approach.
> 
> Note to reviewers: Changes made in ActiveLimitedStrategy.java were made to 
> move towards getting rid of FluentIterable. I figured since I was touching 
> that code, it wouldn't hurt to test the Java 8 equivalent of it. I can get 
> rid of the change here and make it in a separate patch if desired.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> b79e2045ccda05d5058565f81988dfe33feea8f1 
>   src/main/java/org/apache/aurora/scheduler/storage/JobUpdateStore.java 
> b25f3831cecc58c90375c90b16142421f8f09e38 
>   src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
> 10864f122eff5027c88d835baae6de483d960218 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java
>  8d70cae35289a9e36142bab288cf0c9398ebd2d4 
>   
> src/main/java/org/apache/aurora/scheduler/storage/mem/MemJobUpdateStore.java 
> 9e86b9e276ea90a249284a824705b5bbf19dcbce 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
>   
> src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java
>  f8be8058f3a80a18b999d2666e2adb33e1e55fef 
>   src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java 
> 3992aa77fc305adc390a4aaeb1d3939d6241ddbd 
>   
> src/main/java/org/apache/aurora/scheduler/updater/strategy/ActiveLimitedStrategy.java
>  855ea9c20788b51695b7eff5ac0970f0d52a9546 
>   
> src/main/java/org/apache/aurora/scheduler/updater/strategy/VariableBatchStrategy.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/66192/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Renan DelValle
> 
>



Re: Review Request 66192: [WIP] Variable group size updates

2018-03-20 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66192/#review199632
---



Master (f32086d) is red with this patch.
  ./build-support/jenkins/build.sh

  found: IJobUpdateInstructions,boolean
  reason: actual and formal argument lists differ in length
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/java/org/apache/aurora/scheduler/updater/UpdateFactoryImplTest.java:78:
 error: method newUpdate in interface UpdateFactory cannot be applied to given 
types;
Update update = factory.newUpdate(IJobUpdateInstructions.build(config), 
true);
   ^
  required: IJobUpdate,boolean,Storage
  found: IJobUpdateInstructions,boolean
  reason: actual and formal argument lists differ in length
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/java/org/apache/aurora/scheduler/updater/UpdateFactoryImplTest.java:89:
 error: method newUpdate in interface UpdateFactory cannot be applied to given 
types;
Update update = factory.newUpdate(IJobUpdateInstructions.build(config), 
false);
   ^
  required: IJobUpdate,boolean,Storage
  found: IJobUpdateInstructions,boolean
  reason: actual and formal argument lists differ in length
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/java/org/apache/aurora/scheduler/updater/UpdateFactoryImplTest.java:98:
 error: method newUpdate in interface UpdateFactory cannot be applied to given 
types;
Update update = factory.newUpdate(IJobUpdateInstructions.build(config), 
true);
   ^
  required: IJobUpdate,boolean,Storage
  found: IJobUpdateInstructions,boolean
  reason: actual and formal argument lists differ in length
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/java/org/apache/aurora/scheduler/storage/durability/DurableStorageTest.java
 uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
5 errors
 FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':compileTestJava'.
> Compilation failed; see the compiler error output for details.

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug 
option to get more log output.

* Get more help at https://help.gradle.org

BUILD FAILED in 3m 48s
28 actionable tasks: 22 executed, 6 up-to-date


I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On March 21, 2018, 2:10 a.m., Renan DelValle wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66192/
> ---
> 
> (Updated March 21, 2018, 2:10 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, and 
> Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Adding support for variable group sizes when executing an update.
> 
> Design doc for this change is here: 
> https://docs.google.com/document/d/1xGk4ueH8YlmJCk6hQJh85u4to4M1VQD0l630IOchvgY/edit#heading=h.lg3hty82f5cz
> 
> I opted for the path of least resistance with regards to the Thrift changes 
> as I didn't see any benefit in making the larger changes required to make the 
> interfaces a bit more flexible.
> 
> Requesting feedback on these changes and the approach from the community 
> before I proceed.
> 
> Tests will be added after the community approves of the direciton and 
> approach.
> 
> Note to reviewers: Changes made in ActiveLimitedStrategy.java were made to 
> move towards getting rid of FluentIterable. I figured since I was touching 
> that code, it wouldn't hurt to test the Java 8 equivalent of it. I can get 
> rid of the change here and make it in a separate patch if desired.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> b79e2045ccda05d5058565f81988dfe33feea8f1 
>   src/main/java/org/apache/aurora/scheduler/storage/JobUpdateStore.java 
> b25f3831cecc58c90375c90b16142421f8f09e38 
>   src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
> 10864f122eff5027c88d835baae6de483d960218 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java
>  8d70cae35289a9e36142bab288cf0c9398ebd2d4 
>   
> src/main/java/org/apache/aurora/scheduler/storage/mem/MemJobUpdateStore.java 
> 9e86b9e276ea90a249284a824705b5bbf19dcbce 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
>   
> src/main/java/org/apache/aurora/scheduler/update

Review Request 66192: [WIP] Variable group size updates

2018-03-20 Thread Renan DelValle

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66192/
---

Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, and 
Stephan Erb.


Repository: aurora


Description
---

Adding support for variable group sizes when executing an update.

Design doc for this change is here: 
https://docs.google.com/document/d/1xGk4ueH8YlmJCk6hQJh85u4to4M1VQD0l630IOchvgY/edit#heading=h.lg3hty82f5cz

I opted for the path of least resistance with regards to the Thrift changes as 
I didn't see any benefit in making the larger changes required to make the 
interfaces a bit more flexible.

Requesting feedback on these changes and the approach from the community before 
I proceed.

Tests will be added after the community approves of the direciton and approach.

Note to reviewers: Changes made in ActiveLimitedStrategy.java were made to move 
towards getting rid of FluentIterable. I figured since I was touching that 
code, it wouldn't hurt to test the Java 8 equivalent of it. I can get rid of 
the change here and make it in a separate patch if desired.


Diffs
-

  api/src/main/thrift/org/apache/aurora/gen/api.thrift 
ef754e32172e7490a47a13e7b526f243ffa3efeb 
  api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
b79e2045ccda05d5058565f81988dfe33feea8f1 
  src/main/java/org/apache/aurora/scheduler/storage/JobUpdateStore.java 
b25f3831cecc58c90375c90b16142421f8f09e38 
  src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
10864f122eff5027c88d835baae6de483d960218 
  
src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java 
8d70cae35289a9e36142bab288cf0c9398ebd2d4 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemJobUpdateStore.java 
9e86b9e276ea90a249284a824705b5bbf19dcbce 
  
src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 
9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
  
src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java 
f8be8058f3a80a18b999d2666e2adb33e1e55fef 
  src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java 
3992aa77fc305adc390a4aaeb1d3939d6241ddbd 
  
src/main/java/org/apache/aurora/scheduler/updater/strategy/ActiveLimitedStrategy.java
 855ea9c20788b51695b7eff5ac0970f0d52a9546 
  
src/main/java/org/apache/aurora/scheduler/updater/strategy/VariableBatchStrategy.java
 PRE-CREATION 


Diff: https://reviews.apache.org/r/66192/diff/1/


Testing
---


Thanks,

Renan DelValle



Re: Review Request 66190: Fix 'PreemptorSlotSearchBenchmark', remove 'isProduction' references in benchmark

2018-03-20 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66190/#review199628
---


Ship it!




Ship It!

- Santhosh Kumar Shanmugham


On March 20, 2018, 5:26 p.m., Jordan Ly wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66190/
> ---
> 
> (Updated March 20, 2018, 5:26 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, and 
> Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This benchmark was using the deprecated `production` flag when building the 
> tasks for the cluster. `PendingTaskProcessor` depends on `tier` instead, so 
> this benchmark ended up not testing the correct codepath.
> 
> Removed references to `production` and added `tier` instead. Additionally, 
> removed some unused options.
> 
> 
> Diffs
> -
> 
>   src/jmh/java/org/apache/aurora/benchmark/BenchmarkSettings.java 
> ddab2eccb22a93ecb67481f399707d2d82df5db2 
>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
> 1f9a5764b502f939f0345ff99fb0fc2830b4c2f0 
>   src/jmh/java/org/apache/aurora/benchmark/Tasks.java 
> 60c62bbf3061650a5dd8654045dc8189293d0190 
> 
> 
> Diff: https://reviews.apache.org/r/66190/diff/2/
> 
> 
> Testing
> ---
> 
> Old:
> ```
> # Run complete. Total time: 00:08:32
> 
> Benchmark 
>(numPendingTasks)   Mode  Cnt Score Error  
>  Units
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark
>1  thrpt   1057.670 ±  20.451  
>  ops/s
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate 
>1  thrpt   10   595.374 ± 211.805  
> MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate.norm
>1  thrpt   10  10830342.916 ± 380.919
> B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space
>1  thrpt   10   593.530 ± 222.002  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space.norm
>   1  thrpt   10  10717947.102 ± 1280229.296B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space
>1  thrpt   10 0.305 ±   1.264  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space.norm
>   1  thrpt   10 13552.434 ±   61403.918B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.count  
>1  thrpt   1060.000
> counts
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.time   
>1  thrpt   10   202.000
> ms
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·stack 
>1  thrptNaN
>---
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark
>   10  thrpt   1052.161 ±   8.526  
>  ops/s
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate 
>   10  thrpt   10   550.771 ±  89.939  
> MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate.norm
>   10  thrpt   10  11074211.352 ± 318.376
> B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space
>   10  thrpt   10   550.125 ± 107.470  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space.norm
>  10  thrpt   10  11073792.311 ± 1621636.993B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space
>   10  thrpt   10 0.038 ±   0.049  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space.norm
>  10  thrpt   10   737.753 ± 919.460B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.count  
>   10  thrpt   1055.000
> counts
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.time   
>   10  thrpt   10   155.000
> ms
> SchedulingBench

Re: Review Request 66154: Adding support for using custom executors via the Aurora DSL

2018-03-20 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66154/#review199627
---


Ship it!




Ship It!

- Santhosh Kumar Shanmugham


On March 20, 2018, 4:24 p.m., Renan DelValle wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66154/
> ---
> 
> (Updated March 20, 2018, 4:24 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Santhosh Kumar Shanmugham, and Stephan 
> Erb.
> 
> 
> Bugs: AURORA-1981
> https://issues.apache.org/jira/browse/AURORA-1981
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Adding executor_config as a field to MesosJob in order to allow the use of 
> custom executors through the DSL.
> 
> First patch in a series to catch up on some work I have backlogged regarding 
> custom executors.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md b5d06c42badd3b623ad365a5d34a7162e8c782f2 
>   docs/features/custom-executors.md 1357c1ed334dbd90ac9937c557766ce31686c293 
>   docs/reference/configuration.md 725e073e8057f0e5e17f50285d86c89638c96d16 
>   src/main/python/apache/aurora/config/schema/base.py 
> 3d57d6a68b9553938c8ef8a15e6ac88d466f7c3c 
>   src/main/python/apache/aurora/config/thrift.py 
> dcabb03500c8e60d5585f0e65c74e163bd93e43a 
>   src/test/python/apache/aurora/client/cli/test_inspect.py 
> 8c624800eed383383dfb07c10aae103c54875728 
>   src/test/python/apache/aurora/config/test_thrift.py 
> 7bf050836633780401c5d8a371e006bde025c420 
> 
> 
> Diff: https://reviews.apache.org/r/66154/diff/3/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> 
> Thanks,
> 
> Renan DelValle
> 
>



Re: Review Request 66154: Adding support for using custom executors via the Aurora DSL

2018-03-20 Thread Santhosh Kumar Shanmugham


> On March 20, 2018, 5:06 p.m., Santhosh Kumar Shanmugham wrote:
> > docs/features/custom-executors.md
> > Line 151 (original), 151 (patched)
> > 
> >
> > Link to 
> > http://aurora.apache.org/documentation/latest/features/custom-executors/ ?
> 
> Renan DelValle wrote:
> Unless  I misunderstood you, `docs/features/custom-executors.md` is 
> converted to 
> http://aurora.apache.org/documentation/latest/features/custom-executors/ when 
> put through Rake2.0. So in this case the page would end up linking to itself 
> :).

You are correct. Please ignore.


- Santhosh Kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66154/#review199509
---


On March 20, 2018, 4:24 p.m., Renan DelValle wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66154/
> ---
> 
> (Updated March 20, 2018, 4:24 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Santhosh Kumar Shanmugham, and Stephan 
> Erb.
> 
> 
> Bugs: AURORA-1981
> https://issues.apache.org/jira/browse/AURORA-1981
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Adding executor_config as a field to MesosJob in order to allow the use of 
> custom executors through the DSL.
> 
> First patch in a series to catch up on some work I have backlogged regarding 
> custom executors.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md b5d06c42badd3b623ad365a5d34a7162e8c782f2 
>   docs/features/custom-executors.md 1357c1ed334dbd90ac9937c557766ce31686c293 
>   docs/reference/configuration.md 725e073e8057f0e5e17f50285d86c89638c96d16 
>   src/main/python/apache/aurora/config/schema/base.py 
> 3d57d6a68b9553938c8ef8a15e6ac88d466f7c3c 
>   src/main/python/apache/aurora/config/thrift.py 
> dcabb03500c8e60d5585f0e65c74e163bd93e43a 
>   src/test/python/apache/aurora/client/cli/test_inspect.py 
> 8c624800eed383383dfb07c10aae103c54875728 
>   src/test/python/apache/aurora/config/test_thrift.py 
> 7bf050836633780401c5d8a371e006bde025c420 
> 
> 
> Diff: https://reviews.apache.org/r/66154/diff/3/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> 
> Thanks,
> 
> Renan DelValle
> 
>



Re: Review Request 66190: Fix 'PreemptorSlotSearchBenchmark', remove 'isProduction' references in benchmark

2018-03-20 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66190/#review199625
---


Ship it!




Master (b3fa9fe) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On March 20, 2018, 5:26 p.m., Jordan Ly wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66190/
> ---
> 
> (Updated March 20, 2018, 5:26 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, and 
> Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This benchmark was using the deprecated `production` flag when building the 
> tasks for the cluster. `PendingTaskProcessor` depends on `tier` instead, so 
> this benchmark ended up not testing the correct codepath.
> 
> Removed references to `production` and added `tier` instead. Additionally, 
> removed some unused options.
> 
> 
> Diffs
> -
> 
>   src/jmh/java/org/apache/aurora/benchmark/BenchmarkSettings.java 
> ddab2eccb22a93ecb67481f399707d2d82df5db2 
>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
> 1f9a5764b502f939f0345ff99fb0fc2830b4c2f0 
>   src/jmh/java/org/apache/aurora/benchmark/Tasks.java 
> 60c62bbf3061650a5dd8654045dc8189293d0190 
> 
> 
> Diff: https://reviews.apache.org/r/66190/diff/2/
> 
> 
> Testing
> ---
> 
> Old:
> ```
> # Run complete. Total time: 00:08:32
> 
> Benchmark 
>(numPendingTasks)   Mode  Cnt Score Error  
>  Units
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark
>1  thrpt   1057.670 ±  20.451  
>  ops/s
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate 
>1  thrpt   10   595.374 ± 211.805  
> MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate.norm
>1  thrpt   10  10830342.916 ± 380.919
> B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space
>1  thrpt   10   593.530 ± 222.002  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space.norm
>   1  thrpt   10  10717947.102 ± 1280229.296B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space
>1  thrpt   10 0.305 ±   1.264  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space.norm
>   1  thrpt   10 13552.434 ±   61403.918B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.count  
>1  thrpt   1060.000
> counts
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.time   
>1  thrpt   10   202.000
> ms
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·stack 
>1  thrptNaN
>---
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark
>   10  thrpt   1052.161 ±   8.526  
>  ops/s
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate 
>   10  thrpt   10   550.771 ±  89.939  
> MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate.norm
>   10  thrpt   10  11074211.352 ± 318.376
> B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space
>   10  thrpt   10   550.125 ± 107.470  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space.norm
>  10  thrpt   10  11073792.311 ± 1621636.993B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space
>   10  thrpt   10 0.038 ±   0.049  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space.norm
>  10  thrpt   10   737.753 ± 919.460B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.count  
>   10  thrpt   1055.000
> counts
> SchedulingBenchmarks.PreemptorSlotSearch

Re: Review Request 66154: Adding support for using custom executors via the Aurora DSL

2018-03-20 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66154/#review199624
---


Ship it!




Master (b3fa9fe) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On March 20, 2018, 4:24 p.m., Renan DelValle wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66154/
> ---
> 
> (Updated March 20, 2018, 4:24 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Santhosh Kumar Shanmugham, and Stephan 
> Erb.
> 
> 
> Bugs: AURORA-1981
> https://issues.apache.org/jira/browse/AURORA-1981
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Adding executor_config as a field to MesosJob in order to allow the use of 
> custom executors through the DSL.
> 
> First patch in a series to catch up on some work I have backlogged regarding 
> custom executors.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md b5d06c42badd3b623ad365a5d34a7162e8c782f2 
>   docs/features/custom-executors.md 1357c1ed334dbd90ac9937c557766ce31686c293 
>   docs/reference/configuration.md 725e073e8057f0e5e17f50285d86c89638c96d16 
>   src/main/python/apache/aurora/config/schema/base.py 
> 3d57d6a68b9553938c8ef8a15e6ac88d466f7c3c 
>   src/main/python/apache/aurora/config/thrift.py 
> dcabb03500c8e60d5585f0e65c74e163bd93e43a 
>   src/test/python/apache/aurora/client/cli/test_inspect.py 
> 8c624800eed383383dfb07c10aae103c54875728 
>   src/test/python/apache/aurora/config/test_thrift.py 
> 7bf050836633780401c5d8a371e006bde025c420 
> 
> 
> Diff: https://reviews.apache.org/r/66154/diff/3/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> 
> Thanks,
> 
> Renan DelValle
> 
>



Re: Review Request 66154: Adding support for using custom executors via the Aurora DSL

2018-03-20 Thread Renan DelValle


> On March 20, 2018, 5:06 p.m., Santhosh Kumar Shanmugham wrote:
> > RELEASE-NOTES.md
> > Lines 7 (patched)
> > 
> >
> > s/---/-/g

Great catch!


> On March 20, 2018, 5:06 p.m., Santhosh Kumar Shanmugham wrote:
> > docs/features/custom-executors.md
> > Line 151 (original), 151 (patched)
> > 
> >
> > Link to 
> > http://aurora.apache.org/documentation/latest/features/custom-executors/ ?

Unless  I misunderstood you, `docs/features/custom-executors.md` is converted 
to http://aurora.apache.org/documentation/latest/features/custom-executors/ 
when put through Rake2.0. So in this case the page would end up linking to 
itself :).


- Renan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66154/#review199509
---


On March 20, 2018, 4:24 p.m., Renan DelValle wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66154/
> ---
> 
> (Updated March 20, 2018, 4:24 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Santhosh Kumar Shanmugham, and Stephan 
> Erb.
> 
> 
> Bugs: AURORA-1981
> https://issues.apache.org/jira/browse/AURORA-1981
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Adding executor_config as a field to MesosJob in order to allow the use of 
> custom executors through the DSL.
> 
> First patch in a series to catch up on some work I have backlogged regarding 
> custom executors.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md b5d06c42badd3b623ad365a5d34a7162e8c782f2 
>   docs/features/custom-executors.md 1357c1ed334dbd90ac9937c557766ce31686c293 
>   docs/reference/configuration.md 725e073e8057f0e5e17f50285d86c89638c96d16 
>   src/main/python/apache/aurora/config/schema/base.py 
> 3d57d6a68b9553938c8ef8a15e6ac88d466f7c3c 
>   src/main/python/apache/aurora/config/thrift.py 
> dcabb03500c8e60d5585f0e65c74e163bd93e43a 
>   src/test/python/apache/aurora/client/cli/test_inspect.py 
> 8c624800eed383383dfb07c10aae103c54875728 
>   src/test/python/apache/aurora/config/test_thrift.py 
> 7bf050836633780401c5d8a371e006bde025c420 
> 
> 
> Diff: https://reviews.apache.org/r/66154/diff/2/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> 
> Thanks,
> 
> Renan DelValle
> 
>



Re: Review Request 66190: Fix 'PreemptorSlotSearchBenchmark', remove 'isProduction' references in benchmark

2018-03-20 Thread Jordan Ly

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66190/
---

(Updated March 21, 2018, 12:26 a.m.)


Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, and 
Stephan Erb.


Changes
---

Fix title (seemed a bit misleading oops).


Summary (updated)
-

Fix 'PreemptorSlotSearchBenchmark', remove 'isProduction' references in 
benchmark


Repository: aurora


Description
---

This benchmark was using the deprecated `production` flag when building the 
tasks for the cluster. `PendingTaskProcessor` depends on `tier` instead, so 
this benchmark ended up not testing the correct codepath.

Removed references to `production` and added `tier` instead. Additionally, 
removed some unused options.


Diffs (updated)
-

  src/jmh/java/org/apache/aurora/benchmark/BenchmarkSettings.java 
ddab2eccb22a93ecb67481f399707d2d82df5db2 
  src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
1f9a5764b502f939f0345ff99fb0fc2830b4c2f0 
  src/jmh/java/org/apache/aurora/benchmark/Tasks.java 
60c62bbf3061650a5dd8654045dc8189293d0190 


Diff: https://reviews.apache.org/r/66190/diff/2/

Changes: https://reviews.apache.org/r/66190/diff/1-2/


Testing
---

Old:
```
# Run complete. Total time: 00:08:32

Benchmark   
 (numPendingTasks)   Mode  Cnt Score Error   
Units
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark  
 1  thrpt   1057.670 ±  20.451   
ops/s
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate   
 1  thrpt   10   595.374 ± 211.805  
MB/sec
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate.norm
   1  thrpt   10  10830342.916 ± 380.919B/op
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space
   1  thrpt   10   593.530 ± 222.002  MB/sec
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space.norm
  1  thrpt   10  10717947.102 ± 1280229.296B/op
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space
   1  thrpt   10 0.305 ±   1.264  MB/sec
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space.norm
  1  thrpt   10 13552.434 ±   61403.918B/op
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.count
 1  thrpt   1060.000
counts
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.time 
 1  thrpt   10   202.000
ms
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·stack   
 1  thrptNaN   
---
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark  
10  thrpt   1052.161 ±   8.526   
ops/s
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate   
10  thrpt   10   550.771 ±  89.939  
MB/sec
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate.norm
  10  thrpt   10  11074211.352 ± 318.376B/op
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space
  10  thrpt   10   550.125 ± 107.470  MB/sec
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space.norm
 10  thrpt   10  11073792.311 ± 1621636.993B/op
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space
  10  thrpt   10 0.038 ±   0.049  MB/sec
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space.norm
 10  thrpt   10   737.753 ± 919.460B/op
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.count
10  thrpt   1055.000
counts
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.time 
10  thrpt   10   155.000
ms
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·stack   
10  thrptNaN   
---
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark  
   100  thrpt   1055

Re: Review Request 66190: Fix 'PreemptorSlotSearchBenchmark', remove 'isProduction' references

2018-03-20 Thread Jordan Ly

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66190/#review199617
---




src/jmh/java/org/apache/aurora/benchmark/BenchmarkSettings.java
Lines 72-74 (original)


Reviewers note: removed this option as it was a bit confusing. However, I 
may lack context on it. Is there any reason to have this? I would imagine we 
could set `victimClusterUtilization` to 0 to avoid preemptions and achieve the 
same effect.



src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java
Lines 386-387 (original), 386-387 (patched)


Changed these values to be a bit forward facing. I plan on adding a patch 
that reduces the number of hosts we look at when doing preemptions.


- Jordan Ly


On March 21, 2018, 12:04 a.m., Jordan Ly wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66190/
> ---
> 
> (Updated March 21, 2018, 12:04 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, and 
> Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This benchmark was using the deprecated `production` flag when building the 
> tasks for the cluster. `PendingTaskProcessor` depends on `tier` instead, so 
> this benchmark ended up not testing the correct codepath.
> 
> Removed references to `production` and added `tier` instead. Additionally, 
> removed some unused options.
> 
> 
> Diffs
> -
> 
>   src/jmh/java/org/apache/aurora/benchmark/BenchmarkSettings.java 
> ddab2eccb22a93ecb67481f399707d2d82df5db2 
>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
> 1f9a5764b502f939f0345ff99fb0fc2830b4c2f0 
>   src/jmh/java/org/apache/aurora/benchmark/Tasks.java 
> 60c62bbf3061650a5dd8654045dc8189293d0190 
> 
> 
> Diff: https://reviews.apache.org/r/66190/diff/1/
> 
> 
> Testing
> ---
> 
> Old:
> ```
> # Run complete. Total time: 00:08:32
> 
> Benchmark 
>(numPendingTasks)   Mode  Cnt Score Error  
>  Units
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark
>1  thrpt   1057.670 ±  20.451  
>  ops/s
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate 
>1  thrpt   10   595.374 ± 211.805  
> MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate.norm
>1  thrpt   10  10830342.916 ± 380.919
> B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space
>1  thrpt   10   593.530 ± 222.002  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space.norm
>   1  thrpt   10  10717947.102 ± 1280229.296B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space
>1  thrpt   10 0.305 ±   1.264  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space.norm
>   1  thrpt   10 13552.434 ±   61403.918B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.count  
>1  thrpt   1060.000
> counts
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.time   
>1  thrpt   10   202.000
> ms
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·stack 
>1  thrptNaN
>---
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark
>   10  thrpt   1052.161 ±   8.526  
>  ops/s
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate 
>   10  thrpt   10   550.771 ±  89.939  
> MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate.norm
>   10  thrpt   10  11074211.352 ± 318.376
> B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space
>   10  thrpt   10   550.125 ± 107.470  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space.norm
>  10  thrpt   10  11073792.311 ± 1621636.993B/op
> SchedulingBenchmarks.P

Re: Review Request 66154: Adding support for using custom executors via the Aurora DSL

2018-03-20 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66154/#review199618
---


Ship it!




Master (b3fa9fe) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On March 20, 2018, 11:24 p.m., Renan DelValle wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66154/
> ---
> 
> (Updated March 20, 2018, 11:24 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Santhosh Kumar Shanmugham, and Stephan 
> Erb.
> 
> 
> Bugs: AURORA-1981
> https://issues.apache.org/jira/browse/AURORA-1981
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Adding executor_config as a field to MesosJob in order to allow the use of 
> custom executors through the DSL.
> 
> First patch in a series to catch up on some work I have backlogged regarding 
> custom executors.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md b5d06c42badd3b623ad365a5d34a7162e8c782f2 
>   docs/features/custom-executors.md 1357c1ed334dbd90ac9937c557766ce31686c293 
>   docs/reference/configuration.md 725e073e8057f0e5e17f50285d86c89638c96d16 
>   src/main/python/apache/aurora/config/schema/base.py 
> 3d57d6a68b9553938c8ef8a15e6ac88d466f7c3c 
>   src/main/python/apache/aurora/config/thrift.py 
> dcabb03500c8e60d5585f0e65c74e163bd93e43a 
>   src/test/python/apache/aurora/client/cli/test_inspect.py 
> 8c624800eed383383dfb07c10aae103c54875728 
>   src/test/python/apache/aurora/config/test_thrift.py 
> 7bf050836633780401c5d8a371e006bde025c420 
> 
> 
> Diff: https://reviews.apache.org/r/66154/diff/2/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> 
> Thanks,
> 
> Renan DelValle
> 
>



Re: Review Request 66154: Adding support for using custom executors via the Aurora DSL

2018-03-20 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66154/#review199509
---


Fix it, then Ship it!




LGTM. +1 to adding an e2e test.


RELEASE-NOTES.md
Lines 7 (patched)


s/---/-/g



docs/features/custom-executors.md
Line 151 (original), 151 (patched)


Link to 
http://aurora.apache.org/documentation/latest/features/custom-executors/ ?


- Santhosh Kumar Shanmugham


On March 20, 2018, 4:24 p.m., Renan DelValle wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66154/
> ---
> 
> (Updated March 20, 2018, 4:24 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Santhosh Kumar Shanmugham, and Stephan 
> Erb.
> 
> 
> Bugs: AURORA-1981
> https://issues.apache.org/jira/browse/AURORA-1981
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Adding executor_config as a field to MesosJob in order to allow the use of 
> custom executors through the DSL.
> 
> First patch in a series to catch up on some work I have backlogged regarding 
> custom executors.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md b5d06c42badd3b623ad365a5d34a7162e8c782f2 
>   docs/features/custom-executors.md 1357c1ed334dbd90ac9937c557766ce31686c293 
>   docs/reference/configuration.md 725e073e8057f0e5e17f50285d86c89638c96d16 
>   src/main/python/apache/aurora/config/schema/base.py 
> 3d57d6a68b9553938c8ef8a15e6ac88d466f7c3c 
>   src/main/python/apache/aurora/config/thrift.py 
> dcabb03500c8e60d5585f0e65c74e163bd93e43a 
>   src/test/python/apache/aurora/client/cli/test_inspect.py 
> 8c624800eed383383dfb07c10aae103c54875728 
>   src/test/python/apache/aurora/config/test_thrift.py 
> 7bf050836633780401c5d8a371e006bde025c420 
> 
> 
> Diff: https://reviews.apache.org/r/66154/diff/2/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> 
> Thanks,
> 
> Renan DelValle
> 
>



Review Request 66190: Fix 'PreemptorSlotSearchBenchmark', remove 'isProduction' references

2018-03-20 Thread Jordan Ly

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66190/
---

Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, and 
Stephan Erb.


Repository: aurora


Description
---

This benchmark was using the deprecated `production` flag when building the 
tasks for the cluster. `PendingTaskProcessor` depends on `tier` instead, so 
this benchmark ended up not testing the correct codepath.

Removed references to `production` and added `tier` instead. Additionally, 
removed some unused options.


Diffs
-

  src/jmh/java/org/apache/aurora/benchmark/BenchmarkSettings.java 
ddab2eccb22a93ecb67481f399707d2d82df5db2 
  src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
1f9a5764b502f939f0345ff99fb0fc2830b4c2f0 
  src/jmh/java/org/apache/aurora/benchmark/Tasks.java 
60c62bbf3061650a5dd8654045dc8189293d0190 


Diff: https://reviews.apache.org/r/66190/diff/1/


Testing
---

Old:
```
# Run complete. Total time: 00:08:32

Benchmark   
 (numPendingTasks)   Mode  Cnt Score Error   
Units
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark  
 1  thrpt   1057.670 ±  20.451   
ops/s
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate   
 1  thrpt   10   595.374 ± 211.805  
MB/sec
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate.norm
   1  thrpt   10  10830342.916 ± 380.919B/op
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space
   1  thrpt   10   593.530 ± 222.002  MB/sec
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space.norm
  1  thrpt   10  10717947.102 ± 1280229.296B/op
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space
   1  thrpt   10 0.305 ±   1.264  MB/sec
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space.norm
  1  thrpt   10 13552.434 ±   61403.918B/op
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.count
 1  thrpt   1060.000
counts
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.time 
 1  thrpt   10   202.000
ms
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·stack   
 1  thrptNaN   
---
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark  
10  thrpt   1052.161 ±   8.526   
ops/s
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate   
10  thrpt   10   550.771 ±  89.939  
MB/sec
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate.norm
  10  thrpt   10  11074211.352 ± 318.376B/op
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space
  10  thrpt   10   550.125 ± 107.470  MB/sec
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space.norm
 10  thrpt   10  11073792.311 ± 1621636.993B/op
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space
  10  thrpt   10 0.038 ±   0.049  MB/sec
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space.norm
 10  thrpt   10   737.753 ± 919.460B/op
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.count
10  thrpt   1055.000
counts
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.time 
10  thrpt   10   155.000
ms
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·stack   
10  thrptNaN   
---
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark  
   100  thrpt   1055.574 ±   7.414   
ops/s
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate   
   100  thrpt   10   588.906 ±  78.568  
MB/sec
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate.norm
   

Re: Review Request 66139: Speedup regular Thermos observer checkpoint refresh

2018-03-20 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/#review199612
---


Ship it!




Master (b3fa9fe) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On March 20, 2018, 10:41 p.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66139/
> ---
> 
> (Updated March 20, 2018, 10:41 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Renan DelValle, and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Profiling indicates that a significant part of the refresh time os spend in 
> `os.path.realpath`. 
> This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
> the `latest`
> symlink in the Mesos folder layout. 
> 
> This patch takes a slightly different approach to solve this problem based on 
> `os.path.islink`.
> The latter is faster as it just needs to look at a single folder rather than 
> an entire path.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/common/path_detector.py 
> ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
>   src/test/python/apache/aurora/executor/common/test_path_detector.py 
> 7b5ef0cf552d22d4cfbf3357071de036551026dc 
> 
> 
> Diff: https://reviews.apache.org/r/66139/diff/2/
> 
> 
> Testing
> ---
> 
> I have tested this build on a node with 55 running tasks and 2004 finished 
> ones.
> 
> Before this patch:
> 
> D0320 22:20:44.887248 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.92s
> D0320 22:20:50.746316 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.93s
> D0320 22:20:56.590157 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.89s
>  
> With this patch:
> 
> D0320 22:18:53.545236 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.48s
> D0320 22:18:59.031919 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.49s
> D0320 22:19:04.512358 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.48s
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 66186: Upgrade to psutil with optimized Process.children()

2018-03-20 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66186/#review199607
---



Master (b3fa9fe) is red with this patch.
  ./build-support/jenkins/build.sh

 
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_params[add_instances]
 <- 
.pants.d/pyprep/sources/86d22f98ae41c0e70ab2ce2987a3f6bb0bba61ea/apache/aurora/client/hooks/test_hooked_api.py
 PASSED [ 46%]
 
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_params[create_job]
 <- 
.pants.d/pyprep/sources/86d22f98ae41c0e70ab2ce2987a3f6bb0bba61ea/apache/aurora/client/hooks/test_hooked_api.py
 PASSED [ 53%]
 
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_params[kill_job]
 <- 
.pants.d/pyprep/sources/86d22f98ae41c0e70ab2ce2987a3f6bb0bba61ea/apache/aurora/client/hooks/test_hooked_api.py
 PASSED [ 60%]
 
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_params[restart]
 <- 
.pants.d/pyprep/sources/86d22f98ae41c0e70ab2ce2987a3f6bb0bba61ea/apache/aurora/client/hooks/test_hooked_api.py
 PASSED [ 66%]
 
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_params[start_cronjob]
 <- 
.pants.d/pyprep/sources/86d22f98ae41c0e70ab2ce2987a3f6bb0bba61ea/apache/aurora/client/hooks/test_hooked_api.py
 PASSED [ 73%]
 
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_params[start_job_update]
 <- 
.pants.d/pyprep/sources/86d22f98ae41c0e70ab2ce2987a3f6bb0bba61ea/apache/aurora/client/hooks/test_hooked_api.py
 PASSED [ 80%]
 
src/test/python/apache/aurora/client/hooks/test_non_hooked_api.py::TestNonHookedAuroraClientAPI::test_kill_job_discards_config
 <- 
.pants.d/pyprep/sources/86d22f98ae41c0e70ab2ce2987a3f6bb0bba61ea/apache/aurora/client/hooks/test_non_hooked_api.py
 PASSED [ 86%]
 
src/test/python/apache/aurora/client/hooks/test_non_hooked_api.py::TestNonHookedAuroraClientAPI::test_restart_discards_config
 <- 
.pants.d/pyprep/sources/86d22f98ae41c0e70ab2ce2987a3f6bb0bba61ea/apache/aurora/client/hooks/test_non_hooked_api.py
 PASSED [ 93%]
 
src/test/python/apache/aurora/client/hooks/test_non_hooked_api.py::TestNonHookedAuroraClientAPI::test_start_cronjob_discards_config
 <- 
.pants.d/pyprep/sources/86d22f98ae41c0e70ab2ce2987a3f6bb0bba61ea/apache/aurora/client/hooks/test_non_hooked_api.py
 PASSED [100%]
 
  generated xml file: 
/home/jenkins/jenkins-slave/workspace/AuroraBot/.pants.d/test/pytest/src.test.python.apache.aurora.client.hooks.hooks/junitxml/TEST-src.test.python.apache.aurora.client.hooks.hooks.xml
 
 === 15 passed in 0.32 seconds 
 
   src.test.python.apache.aurora.admin.admin
   .   SUCCESS
   src.test.python.apache.aurora.client.client  
   .   SUCCESS
   src.test.python.apache.aurora.client.api.api 
   .   SUCCESS
   src.test.python.apache.aurora.client.cli.cli 
   .   SUCCESS
   src.test.python.apache.aurora.client.docker.docker   
   .   SUCCESS
   src.test.python.apache.aurora.client.hooks.hooks 
   .   SUCCESS
   src.test.python.apache.aurora.common.common  
   .   SUCCESS
   
src.test.python.apache.aurora.common.health_check.health_check  
.   SUCCESS
   src.test.python.apache.aurora.config.config  
   .   SUCCESS
   src.test.python.apache.aurora.executor.executor  
   .   FAILURE
   src.test.python.apache.aurora.executor.bin.bin   
   .   SUCCESS
   src.test.python.apache.aurora.executor.common.common 
   .   SUCCESS
   src.test.python.apache.aurora.tools.tools
   .   SUCCESS
   src.test.python.apache.thermos.cli.cli   
   .   SUCCESS
   src.test.python.apache.thermos.cli.commands.commands 
   .   SUCCESS
   src.test.python.apache.thermos.common.common 
   .   SUCCESS
   src.test.python.apache.thermos.config.config 
   .   SUC

Re: Review Request 66154: Adding support for using custom executors via the Aurora DSL

2018-03-20 Thread Renan DelValle

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66154/
---

(Updated March 20, 2018, 4:24 p.m.)


Review request for Aurora, Jordan Ly, Santhosh Kumar Shanmugham, and Stephan 
Erb.


Changes
---

Adding check for ExecutorConfig.Data when using Thermos as requested by Reza.


Bugs: AURORA-1981
https://issues.apache.org/jira/browse/AURORA-1981


Repository: aurora


Description
---

Adding executor_config as a field to MesosJob in order to allow the use of 
custom executors through the DSL.

First patch in a series to catch up on some work I have backlogged regarding 
custom executors.


Diffs (updated)
-

  RELEASE-NOTES.md b5d06c42badd3b623ad365a5d34a7162e8c782f2 
  docs/features/custom-executors.md 1357c1ed334dbd90ac9937c557766ce31686c293 
  docs/reference/configuration.md 725e073e8057f0e5e17f50285d86c89638c96d16 
  src/main/python/apache/aurora/config/schema/base.py 
3d57d6a68b9553938c8ef8a15e6ac88d466f7c3c 
  src/main/python/apache/aurora/config/thrift.py 
dcabb03500c8e60d5585f0e65c74e163bd93e43a 
  src/test/python/apache/aurora/client/cli/test_inspect.py 
8c624800eed383383dfb07c10aae103c54875728 
  src/test/python/apache/aurora/config/test_thrift.py 
7bf050836633780401c5d8a371e006bde025c420 


Diff: https://reviews.apache.org/r/66154/diff/2/

Changes: https://reviews.apache.org/r/66154/diff/1-2/


Testing
---

./build-support/jenkins/build.sh

./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh


Thanks,

Renan DelValle



Re: Review Request 66186: Upgrade to psutil with optimized Process.children()

2018-03-20 Thread Stephan Erb

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66186/#review199603
---




src/test/python/apache/thermos/monitoring/test_process_collector_psutil.py
Line 22 (original), 22 (patched)


`process_iter` used to be an implementation detail of `Process.children()` 
but got removed in https://github.com/giampaolo/psutil/pull/1185/files


- Stephan Erb


On March 21, 2018, 12:15 a.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66186/
> ---
> 
> (Updated March 21, 2018, 12:15 a.m.)
> 
> 
> Review request for Aurora, Reza Motamedi and Santhosh Kumar Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> The changelog claims: `Process.children() is 2x faster on UNIX and 2.4x 
> faster on Linux.`
> 
> This is needed for all stats retrieved via `ProcessTreeCollector`. An update 
> therefore
> seems worthwhile. 
> 
> https://github.com/giampaolo/psutil/blob/master/HISTORY.rst
> 
> 
> Diffs
> -
> 
>   3rdparty/python/requirements.txt 4ac242cfa2c1c19cb7447816ab86e748839d3d11 
>   src/test/python/apache/thermos/monitoring/test_process_collector_psutil.py 
> 93ff878be578fa7a63d25b65e7d915790dc9ccc6 
> 
> 
> Diff: https://reviews.apache.org/r/66186/diff/1/
> 
> 
> Testing
> ---
> 
> TODO: I still want to double check this on a real running observer
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Review Request 66186: Upgrade to psutil with optimized Process.children()

2018-03-20 Thread Stephan Erb

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66186/
---

Review request for Aurora, Reza Motamedi and Santhosh Kumar Shanmugham.


Repository: aurora


Description
---

The changelog claims: `Process.children() is 2x faster on UNIX and 2.4x faster 
on Linux.`

This is needed for all stats retrieved via `ProcessTreeCollector`. An update 
therefore
seems worthwhile. 

https://github.com/giampaolo/psutil/blob/master/HISTORY.rst


Diffs
-

  3rdparty/python/requirements.txt 4ac242cfa2c1c19cb7447816ab86e748839d3d11 
  src/test/python/apache/thermos/monitoring/test_process_collector_psutil.py 
93ff878be578fa7a63d25b65e7d915790dc9ccc6 


Diff: https://reviews.apache.org/r/66186/diff/1/


Testing
---

TODO: I still want to double check this on a real running observer


Thanks,

Stephan Erb



Re: Review Request 66139: Speedup regular Thermos observer checkpoint refresh

2018-03-20 Thread Stephan Erb

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/
---

(Updated March 20, 2018, 11:41 p.m.)


Review request for Aurora, Jordan Ly, Renan DelValle, and Reza Motamedi.


Changes
---

Optimize path detection without changing or removing existing code of 
`ExecutorDetector`.


Repository: aurora


Description (updated)
---

Profiling indicates that a significant part of the refresh time os spend in 
`os.path.realpath`. 
This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
the `latest`
symlink in the Mesos folder layout. 

This patch takes a slightly different approach to solve this problem based on 
`os.path.islink`.
The latter is faster as it just needs to look at a single folder rather than an 
entire path.


Diffs (updated)
-

  src/main/python/apache/aurora/executor/common/path_detector.py 
ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
  src/test/python/apache/aurora/executor/common/test_path_detector.py 
7b5ef0cf552d22d4cfbf3357071de036551026dc 


Diff: https://reviews.apache.org/r/66139/diff/2/

Changes: https://reviews.apache.org/r/66139/diff/1-2/


Testing (updated)
---

I have tested this build on a node with 55 running tasks and 2004 finished ones.

Before this patch:

D0320 22:20:44.887248 25771 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.92s
D0320 22:20:50.746316 25771 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.93s
D0320 22:20:56.590157 25771 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.89s
 
With this patch:

D0320 22:18:53.545236 16250 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.48s
D0320 22:18:59.031919 16250 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.49s
D0320 22:19:04.512358 16250 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.48s


Thanks,

Stephan Erb



Re: Review Request 66139: Speedup regular Thermos observer checkpoint refresh

2018-03-20 Thread Stephan Erb


> On March 20, 2018, 9:20 p.m., Reza Motamedi wrote:
> > src/main/python/apache/aurora/executor/common/executor_detector.py
> > Lines 28-85 (original), 26-38 (patched)
> > 
> >
> > We are using the functionalities here. Is the performance tunning 
> > related to this code delete?

Oh interesting. Do you use all removed functionality? 

The performance improvement is only partially related. There are cycles wasted 
in the translation from `path` to the parsing result of `ScanfParser` and back 
again, but the dominant factor is the removed `realpath` in `path_detector.py`.

I should be able to rewrite the patch so that `executor_detector.py` remains 
unchanged.


> On March 20, 2018, 9:20 p.m., Reza Motamedi wrote:
> > src/main/python/apache/aurora/executor/common/path_detector.py
> > Lines 31-33 (original), 31-33 (patched)
> > 
> >
> > Curious if this generator pattern is really useful. Don't we just 
> > consume everything it generates?

We need the result of `os.path.join(path, self._sandbox_path)` as a variable, 
in order to both return it and check via `os.path.exists(path)`. An alternative 
would be to use an explicit for loop here.


- Stephan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/#review199564
---


On March 20, 2018, 12:22 a.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66139/
> ---
> 
> (Updated March 20, 2018, 12:22 a.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Renan DelValle, and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Profiling indicated that roughly 70% of the refresh time was spend in 
> `os.path.realpath`. 
> This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
> the `latest`
> symlink in the Mesos folder layout. 
> 
> This patch takes a slightly different approach to solve this problem based on 
> `os.path.islink`.
> The latter is faster as it just needs to look at a single folder rather than 
> an entire path.
> 
> To facilitate this optimization the patch removes unused functionality from 
> ExecutorDetector.
> Most probably this  was used by former GC executor that got removed a few 
> years ago.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/BUILD 
> 486230db34a22ea5dd0f68da911c0afb1afbcac0 
>   src/main/python/apache/aurora/executor/common/executor_detector.py 
> a07bfc34caa5f86153ace8184b061e253c39e92e 
>   src/main/python/apache/aurora/executor/common/path_detector.py 
> ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
>   src/main/python/apache/thermos/monitoring/detector.py 
> 6e5a620e9f49e0960a814f4c1f701a21cc7678fd 
>   src/test/python/apache/aurora/executor/common/test_executor_detector.py 
> d2a948f2e95cbc1723b63462787c4256cc56731d 
>   src/test/python/apache/aurora/executor/common/test_path_detector.py 
> 7b5ef0cf552d22d4cfbf3357071de036551026dc 
> 
> 
> Diff: https://reviews.apache.org/r/66139/diff/1/
> 
> 
> Testing
> ---
> 
> I have tested this build on a node with 75 running tasks and 4500 finished 
> ones.
> 
> Before this patch:
> 
> D0319 09:05:07.219057 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.26s
> D0319 09:05:14.412610 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.39s
> D0319 09:05:21.600985 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.29s
>  
> With this patch:
> 
> D0319 09:00:24.084657 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.77s
> D0319 09:00:29.869699 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.78s
> D0319 09:00:35.643407 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.77s
> 
> The regular 2-3 second freezes when navigating the Thermos UI are now almost 
> gone for me.
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 66154: Adding support for using custom executors via the Aurora DSL

2018-03-20 Thread Reza Motamedi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66154/#review199578
---


Ship it!





src/test/python/apache/aurora/config/test_thrift.py
Lines 324 (patched)


Nit. check the `data` as well. The default config has data.


- Reza Motamedi


On March 19, 2018, 11:33 p.m., Renan DelValle wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66154/
> ---
> 
> (Updated March 19, 2018, 11:33 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Santhosh Kumar Shanmugham, and Stephan 
> Erb.
> 
> 
> Bugs: AURORA-1981
> https://issues.apache.org/jira/browse/AURORA-1981
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Adding executor_config as a field to MesosJob in order to allow the use of 
> custom executors through the DSL.
> 
> First patch in a series to catch up on some work I have backlogged regarding 
> custom executors.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md b5d06c42badd3b623ad365a5d34a7162e8c782f2 
>   docs/features/custom-executors.md 1357c1ed334dbd90ac9937c557766ce31686c293 
>   docs/reference/configuration.md 725e073e8057f0e5e17f50285d86c89638c96d16 
>   src/main/python/apache/aurora/config/schema/base.py 
> 3d57d6a68b9553938c8ef8a15e6ac88d466f7c3c 
>   src/main/python/apache/aurora/config/thrift.py 
> dcabb03500c8e60d5585f0e65c74e163bd93e43a 
>   src/test/python/apache/aurora/client/cli/test_inspect.py 
> 8c624800eed383383dfb07c10aae103c54875728 
>   src/test/python/apache/aurora/config/test_thrift.py 
> 7bf050836633780401c5d8a371e006bde025c420 
> 
> 
> Diff: https://reviews.apache.org/r/66154/diff/1/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> 
> Thanks,
> 
> Renan DelValle
> 
>



Re: Review Request 66139: Speedup regular Thermos observer checkpoint refresh

2018-03-20 Thread Reza Motamedi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/#review199564
---




src/main/python/apache/aurora/executor/common/executor_detector.py
Lines 28-85 (original), 26-38 (patched)


We are using the functionalities here. Is the performance tunning related 
to this code delete?



src/main/python/apache/aurora/executor/common/path_detector.py
Lines 31-33 (original), 31-33 (patched)


Curious if this generator pattern is really useful. Don't we just consume 
everything it generates?


- Reza Motamedi


On March 19, 2018, 11:22 p.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66139/
> ---
> 
> (Updated March 19, 2018, 11:22 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Renan DelValle, and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Profiling indicated that roughly 70% of the refresh time was spend in 
> `os.path.realpath`. 
> This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
> the `latest`
> symlink in the Mesos folder layout. 
> 
> This patch takes a slightly different approach to solve this problem based on 
> `os.path.islink`.
> The latter is faster as it just needs to look at a single folder rather than 
> an entire path.
> 
> To facilitate this optimization the patch removes unused functionality from 
> ExecutorDetector.
> Most probably this  was used by former GC executor that got removed a few 
> years ago.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/BUILD 
> 486230db34a22ea5dd0f68da911c0afb1afbcac0 
>   src/main/python/apache/aurora/executor/common/executor_detector.py 
> a07bfc34caa5f86153ace8184b061e253c39e92e 
>   src/main/python/apache/aurora/executor/common/path_detector.py 
> ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
>   src/main/python/apache/thermos/monitoring/detector.py 
> 6e5a620e9f49e0960a814f4c1f701a21cc7678fd 
>   src/test/python/apache/aurora/executor/common/test_executor_detector.py 
> d2a948f2e95cbc1723b63462787c4256cc56731d 
>   src/test/python/apache/aurora/executor/common/test_path_detector.py 
> 7b5ef0cf552d22d4cfbf3357071de036551026dc 
> 
> 
> Diff: https://reviews.apache.org/r/66139/diff/1/
> 
> 
> Testing
> ---
> 
> I have tested this build on a node with 75 running tasks and 4500 finished 
> ones.
> 
> Before this patch:
> 
> D0319 09:05:07.219057 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.26s
> D0319 09:05:14.412610 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.39s
> D0319 09:05:21.600985 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.29s
>  
> With this patch:
> 
> D0319 09:00:24.084657 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.77s
> D0319 09:00:29.869699 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.78s
> D0319 09:00:35.643407 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.77s
> 
> The regular 2-3 second freezes when navigating the Thermos UI are now almost 
> gone for me.
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 66103: Introduce mesos disk collector

2018-03-20 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66103/#review199560
---


Ship it!




Master (b3fa9fe) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On March 20, 2018, 5:20 p.m., Reza Motamedi wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66103/
> ---
> 
> (Updated March 20, 2018, 5:20 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Daniel Knightly, Jordan Ly, 
> Santhosh Kumar Shanmugham, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> When disk isolation is enabled in a Mesos agent it calculates the disk usage 
> for each container. 
> Thermos Observer also monitors disk usage using `twitter.common.dirutil`, 
> essentially repeating the work already done by the agent. In practice, we see 
> that disk monitoring is one of the most expensive resource monitoring tasks. 
> For instance, when there are deeply nested directories, the CPU utilization 
> of the observer process can easily reach 1.5 CPUs. It would be ideal if we 
> delegate the disk monitoring task to the agent and do it only once. With this 
> approach, when disk collection has improved in the agent (for instance by 
> implementing XFS isolation), we can simply benefit from it without any code 
> change. Some more information about the problem is provided in AURORA-1918.
> 
> This patch that introduces `MesosDiskCollector` which queries the agent's API 
> endpoint to lookup disk_used_bytes. Note that there is also resource 
> monitoring in thermos executor. Currently, I left the disk collector there to 
> use the `du` implementation. That can be changed in a later patch.
> 
> I modified some vagrant config files including `aurora-executor.service` and 
> `etc_mesos-slave/isolation` for testing. They can be left as is. I included 
> them in this patch to show how this would work e2e.
> 
> 
> Diffs
> -
> 
>   3rdparty/python/requirements.txt 4ac242cfa2c1c19cb7447816ab86e748839d3d11 
>   examples/jobs/hello_world.aurora 5401bfebe753b5e53abd08baeac501144ced9b5a 
>   examples/vagrant/mesos_config/etc_mesos-slave/isolation 
> 1a7028ffc70116b104ef3ad22b7388f637707a0f 
>   examples/vagrant/systemd/thermos.service 
> 01925bcd2ae44f100df511f3c3951c3f5a1a72aa 
>   src/main/python/apache/aurora/tools/thermos_observer.py 
> dd9f0c46ceac9e939b1b763073314161de0ea614 
>   src/main/python/apache/thermos/monitoring/BUILD 
> 65ba7088f65e7baa5d30744736ba456b46a55e86 
>   src/main/python/apache/thermos/monitoring/disk.py 
> 986d33a5000f8d5db15cb639c81f8b1d756ffa05 
>   src/main/python/apache/thermos/monitoring/resource.py 
> adcdc751c03460dc801a18278faa96d6bd64722b 
>   src/main/python/apache/thermos/observer/task_observer.py 
> a6870d48bddf2a2ccede7bb68195f2baae1d0e47 
>   
> src/test/python/apache/aurora/executor/common/test_resource_manager_integration.py
>  fe74bd1d3ecd89fca1b5b2251202cbbc0f24 
>   src/test/python/apache/thermos/monitoring/BUILD 
> 8f2b39336dce6c7b580e6ba0009f60afdcb89179 
>   src/test/python/apache/thermos/monitoring/test_disk.py 
> 362393bfd1facf3198e2d438d0596b16700b72b8 
>   src/test/python/apache/thermos/monitoring/test_resource.py 
> e577e552d4ee1807096a15401851bb9fd95fa426 
> 
> 
> Diff: https://reviews.apache.org/r/66103/diff/3/
> 
> 
> Testing
> ---
> 
> I added unit tests.
> Tested in vagrant and it works as intenced.
> I also built and deployed in our test enviroment. In order to measure 
> imporoved performance I created jobs with nested folders and noticed 
> reduction in CPU utilization of the Observer process, by at least 60%. (1.5 
> CPU cores to 0.4 CPU cores)
> 
> 
> Thanks,
> 
> Reza Motamedi
> 
>



Re: Review Request 66103: Introduce mesos disk collector

2018-03-20 Thread Reza Motamedi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66103/
---

(Updated March 20, 2018, 5:20 p.m.)


Review request for Aurora, David McLaughlin, Daniel Knightly, Jordan Ly, 
Santhosh Kumar Shanmugham, and Stephan Erb.


Changes
---

- rebase from master


Repository: aurora


Description
---

When disk isolation is enabled in a Mesos agent it calculates the disk usage 
for each container. 
Thermos Observer also monitors disk usage using `twitter.common.dirutil`, 
essentially repeating the work already done by the agent. In practice, we see 
that disk monitoring is one of the most expensive resource monitoring tasks. 
For instance, when there are deeply nested directories, the CPU utilization of 
the observer process can easily reach 1.5 CPUs. It would be ideal if we 
delegate the disk monitoring task to the agent and do it only once. With this 
approach, when disk collection has improved in the agent (for instance by 
implementing XFS isolation), we can simply benefit from it without any code 
change. Some more information about the problem is provided in AURORA-1918.

This patch that introduces `MesosDiskCollector` which queries the agent's API 
endpoint to lookup disk_used_bytes. Note that there is also resource monitoring 
in thermos executor. Currently, I left the disk collector there to use the `du` 
implementation. That can be changed in a later patch.

I modified some vagrant config files including `aurora-executor.service` and 
`etc_mesos-slave/isolation` for testing. They can be left as is. I included 
them in this patch to show how this would work e2e.


Diffs (updated)
-

  3rdparty/python/requirements.txt 4ac242cfa2c1c19cb7447816ab86e748839d3d11 
  examples/jobs/hello_world.aurora 5401bfebe753b5e53abd08baeac501144ced9b5a 
  examples/vagrant/mesos_config/etc_mesos-slave/isolation 
1a7028ffc70116b104ef3ad22b7388f637707a0f 
  examples/vagrant/systemd/thermos.service 
01925bcd2ae44f100df511f3c3951c3f5a1a72aa 
  src/main/python/apache/aurora/tools/thermos_observer.py 
dd9f0c46ceac9e939b1b763073314161de0ea614 
  src/main/python/apache/thermos/monitoring/BUILD 
65ba7088f65e7baa5d30744736ba456b46a55e86 
  src/main/python/apache/thermos/monitoring/disk.py 
986d33a5000f8d5db15cb639c81f8b1d756ffa05 
  src/main/python/apache/thermos/monitoring/resource.py 
adcdc751c03460dc801a18278faa96d6bd64722b 
  src/main/python/apache/thermos/observer/task_observer.py 
a6870d48bddf2a2ccede7bb68195f2baae1d0e47 
  
src/test/python/apache/aurora/executor/common/test_resource_manager_integration.py
 fe74bd1d3ecd89fca1b5b2251202cbbc0f24 
  src/test/python/apache/thermos/monitoring/BUILD 
8f2b39336dce6c7b580e6ba0009f60afdcb89179 
  src/test/python/apache/thermos/monitoring/test_disk.py 
362393bfd1facf3198e2d438d0596b16700b72b8 
  src/test/python/apache/thermos/monitoring/test_resource.py 
e577e552d4ee1807096a15401851bb9fd95fa426 


Diff: https://reviews.apache.org/r/66103/diff/3/

Changes: https://reviews.apache.org/r/66103/diff/2-3/


Testing
---

I added unit tests.
Tested in vagrant and it works as intenced.
I also built and deployed in our test enviroment. In order to measure imporoved 
performance I created jobs with nested folders and noticed reduction in CPU 
utilization of the Observer process, by at least 60%. (1.5 CPU cores to 0.4 CPU 
cores)


Thanks,

Reza Motamedi



Re: Review Request 66154: Adding support for using custom executors via the Aurora DSL

2018-03-20 Thread Renan DelValle


> On March 19, 2018, 5:41 p.m., David McLaughlin wrote:
> > Not sure how possible it is given the requirement for a completely custom 
> > executor, but e2e coverage would be great.

Definitely possible, it's the end goal of going through my Aurora DSL backlog. 
I will most likely have to use DCE-GO (https://github.com/paypal/dce-go) in an 
e2e test after adding support to the DSL for the Mesos fetcher 
(https://issues.apache.org/jira/browse/AURORA-1982).


- Renan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66154/#review199504
---


On March 19, 2018, 4:33 p.m., Renan DelValle wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66154/
> ---
> 
> (Updated March 19, 2018, 4:33 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Santhosh Kumar Shanmugham, and Stephan 
> Erb.
> 
> 
> Bugs: AURORA-1981
> https://issues.apache.org/jira/browse/AURORA-1981
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Adding executor_config as a field to MesosJob in order to allow the use of 
> custom executors through the DSL.
> 
> First patch in a series to catch up on some work I have backlogged regarding 
> custom executors.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md b5d06c42badd3b623ad365a5d34a7162e8c782f2 
>   docs/features/custom-executors.md 1357c1ed334dbd90ac9937c557766ce31686c293 
>   docs/reference/configuration.md 725e073e8057f0e5e17f50285d86c89638c96d16 
>   src/main/python/apache/aurora/config/schema/base.py 
> 3d57d6a68b9553938c8ef8a15e6ac88d466f7c3c 
>   src/main/python/apache/aurora/config/thrift.py 
> dcabb03500c8e60d5585f0e65c74e163bd93e43a 
>   src/test/python/apache/aurora/client/cli/test_inspect.py 
> 8c624800eed383383dfb07c10aae103c54875728 
>   src/test/python/apache/aurora/config/test_thrift.py 
> 7bf050836633780401c5d8a371e006bde025c420 
> 
> 
> Diff: https://reviews.apache.org/r/66154/diff/1/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> 
> Thanks,
> 
> Renan DelValle
> 
>