Re: Review Request 55243: AURORA-1868 Evaluate multiple preemption proposals per round

2017-01-05 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55243/#review160662
---


Ship it!




Master (d4ebb56) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Jan. 6, 2017, 1:17 a.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55243/
> ---
> 
> (Updated Jan. 6, 2017, 1:17 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Zameer Manji.
> 
> 
> Bugs: AURORA-1868
> https://issues.apache.org/jira/browse/AURORA-1868
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `TaskScheduler` makes an attempt to preempt already identified candidates 
> through `Preemptor` when it fails to schedule one or more tasks. However, 
> `Preemptor` currently evaluates only one proposal per invocation. A proposal 
> may get vetoed at this point by scheduling filters. If a proposal fails 
> validation the task group might get penalized by `TaskGroups` to give 
> `PendingTaskProcessor` some time to find new preemption candidates; despite 
> the fact that another proposal may already exist in `slotCache`. This penalty 
> might result in expiration of existing proposals in `slotCache`, hence 
> slowing down the overall preemption process.
> 
> This patch modifies `Preemptor` so that it evaluates all existing preemption 
> proposals before giving up.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/preemptor/Preemptor.java 
> 7d2903a47dacfc35f9e547ccb6c5896efe3e013f 
>   src/test/java/org/apache/aurora/scheduler/preemptor/PreemptorImplTest.java 
> 40c42b1b3aa63797da8dea61732b49155034fea2 
> 
> Diff: https://reviews.apache.org/r/55243/diff/
> 
> 
> Testing
> ---
> 
> ```
> $ ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> ...
> 
> *** OK (All tests passed) ***
> 
> mesos-master start/running, process 12228
> + RETCODE=0
> + restore_netrc
> + mv /home/vagrant/.netrc.bak /home/vagrant/.netrc
> + true
> Connection to 127.0.0.1 closed.
> 
> real  26m20.055s
> user  0m1.412s
> sys   0m0.705s
> ```
> 
> 
> Thanks,
> 
> Mehrdad Nurolahzade
> 
>



Review Request 55243: AURORA-1868 Evaluate multiple preemption proposals per round

2017-01-05 Thread Mehrdad Nurolahzade

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55243/
---

Review request for Aurora, David McLaughlin and Zameer Manji.


Bugs: AURORA-1868
https://issues.apache.org/jira/browse/AURORA-1868


Repository: aurora


Description
---

`TaskScheduler` makes an attempt to preempt already identified candidates 
through `Preemptor` when it fails to schedule one or more tasks. However, 
`Preemptor` currently evaluates only one proposal per invocation. A proposal 
may get vetoed at this point by scheduling filters. If a proposal fails 
validation the task group might get penalized by `TaskGroups` to give 
`PendingTaskProcessor` some time to find new preemption candidates; despite the 
fact that another proposal may already exist in `slotCache`. This penalty might 
result in expiration of existing proposals in `slotCache`, hence slowing down 
the overall preemption process.

This patch modifies `Preemptor` so that it evaluates all existing preemption 
proposals before giving up.


Diffs
-

  src/main/java/org/apache/aurora/scheduler/preemptor/Preemptor.java 
7d2903a47dacfc35f9e547ccb6c5896efe3e013f 
  src/test/java/org/apache/aurora/scheduler/preemptor/PreemptorImplTest.java 
40c42b1b3aa63797da8dea61732b49155034fea2 

Diff: https://reviews.apache.org/r/55243/diff/


Testing
---

```
$ ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh

...

*** OK (All tests passed) ***

mesos-master start/running, process 12228
+ RETCODE=0
+ restore_netrc
+ mv /home/vagrant/.netrc.bak /home/vagrant/.netrc
+ true
Connection to 127.0.0.1 closed.

real26m20.055s
user0m1.412s
sys 0m0.705s
```


Thanks,

Mehrdad Nurolahzade



Re: Review Request 54957: Add option to not retry api calls to the scheduler.

2017-01-05 Thread Karthik Anantha Padmanabhan


> On Jan. 4, 2017, 11:54 p.m., Karthik Anantha Padmanabhan wrote:
> > I was going to wrap this up - but how do people feel about making the all 
> > endpoints "idempotent" by the following method ? Inlcude an 
> > "idempotency-token" along as part of the HTTP header. This token is locally 
> > cached for, say, an hour. Every request with the same token will return 
> > simply short circuit and not be processed. The retry logic to transport 
> > layer so that we can transparently add the idempotency tokens.
> 
> Santhosh Kumar Shanmugham wrote:
> Can you explain this in detail? Particularly flesh out the `HTTP headers` 
> (aren't we using a Thrift interface while speaking to the Scheduler?) and the 
> `locally cached` (have the Scheduler cache the `idempotency-token` or on the 
> client?) parts.
> 
> I like this approach better, which can fix the issue across all APIs.

https://reviews.apache.org/r/55237/ is a WIP for idempotent requests. Yes we 
still use HTTP as the transport but just stuff the thrift binary in the body of 
a POST request.


- Karthik


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54957/#review160546
---


On Dec. 22, 2016, 12:06 a.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54957/
> ---
> 
> (Updated Dec. 22, 2016, 12:06 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Joshua Cohen, Santhosh Kumar 
> Shanmugham, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This diff adds an option to not retry api calls to the scheduler. For some of 
> the non-idempotent operations we would like to not automatically retry. This 
> patch makes this functionality available only to the `schedule_backup_now` 
> command.
> 
> If there is consensus, this can be added to all commands as well.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/admin/admin.py 
> 070c348d2ca5db1edecf832efd9aa5481bddaa4b 
>   src/main/python/apache/aurora/admin/aurora_admin.py 
> fbebbab8c827b5695042d18770d850e31fc38122 
>   src/main/python/apache/aurora/client/api/__init__.py 
> e1dde638bd1d686269fbcd88cb083a52e7f5dbfc 
>   src/main/python/apache/aurora/client/api/scheduler_client.py 
> 9bbfece012e48e0b1752bbefd25c89e04d312cf6 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> f6018caa4f431e85a9e9ff203ac3d4b6c33f40ef 
> 
> Diff: https://reviews.apache.org/r/54957/diff/
> 
> 
> Testing
> ---
> 
> * Manuall testing
> * ./build-support/jenkins/build.sh passes
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 54957: Add option to not retry api calls to the scheduler.

2017-01-05 Thread Santhosh Kumar Shanmugham


> On Jan. 4, 2017, 3:54 p.m., Karthik Anantha Padmanabhan wrote:
> > I was going to wrap this up - but how do people feel about making the all 
> > endpoints "idempotent" by the following method ? Inlcude an 
> > "idempotency-token" along as part of the HTTP header. This token is locally 
> > cached for, say, an hour. Every request with the same token will return 
> > simply short circuit and not be processed. The retry logic to transport 
> > layer so that we can transparently add the idempotency tokens.

Can you explain this in detail? Particularly flesh out the `HTTP headers` 
(aren't we using a Thrift interface while speaking to the Scheduler?) and the 
`locally cached` (have the Scheduler cache the `idempotency-token` or on the 
client?) parts.

I like this approach better, which can fix the issue across all APIs.


- Santhosh Kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54957/#review160546
---


On Dec. 21, 2016, 4:06 p.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54957/
> ---
> 
> (Updated Dec. 21, 2016, 4:06 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Joshua Cohen, Santhosh Kumar 
> Shanmugham, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This diff adds an option to not retry api calls to the scheduler. For some of 
> the non-idempotent operations we would like to not automatically retry. This 
> patch makes this functionality available only to the `schedule_backup_now` 
> command.
> 
> If there is consensus, this can be added to all commands as well.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/admin/admin.py 
> 070c348d2ca5db1edecf832efd9aa5481bddaa4b 
>   src/main/python/apache/aurora/admin/aurora_admin.py 
> fbebbab8c827b5695042d18770d850e31fc38122 
>   src/main/python/apache/aurora/client/api/__init__.py 
> e1dde638bd1d686269fbcd88cb083a52e7f5dbfc 
>   src/main/python/apache/aurora/client/api/scheduler_client.py 
> 9bbfece012e48e0b1752bbefd25c89e04d312cf6 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> f6018caa4f431e85a9e9ff203ac3d4b6c33f40ef 
> 
> Diff: https://reviews.apache.org/r/54957/diff/
> 
> 
> Testing
> ---
> 
> * Manuall testing
> * ./build-support/jenkins/build.sh passes
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 55217: AURORA-1847 Eliminate sequential scan in MemTaskStore.getJobKeys()

2017-01-05 Thread Mehrdad Nurolahzade


> On Jan. 5, 2017, 11:38 a.m., Dmitriy Shirchenko wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/mem/MemTaskStore.java, 
> > lines 117-126
> > 
> >
> > :nit any reason why you didn't extract second index declration on 
> > 117:120

I initially did but PMD did not like it, so I let it be: 
`src/main/java/org/apache/aurora/scheduler/storage/mem/MemTaskStore.java:99:
Perhaps 'hostIndex' could be replaced by a local variable.
:pmdMain FAILED`


- Mehrdad


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55217/#review160620
---


On Jan. 5, 2017, 10:59 a.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55217/
> ---
> 
> (Updated Jan. 5, 2017, 10:59 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Stephan Erb.
> 
> 
> Bugs: AURORA-1847
> https://issues.apache.org/jira/browse/AURORA-1847
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> If scheduler is configured to run with the `MemTaskStore` every hit on 
> scheduler landing page (`/scheduler`) causes a call to 
> `MemTaskStore.getJobKeys()` through `ReadOnlyScheduler.getRoleSummary()`.
> 
> The implementation of `MemTaskStore.getJobKeys()` is currently very 
> inefficient as it requires a sequential scan of the task store and mapping to 
> their respective job keys. In Twitter clusters this method is currently 
> taking half a second per call (`mem_storage_get_job_keys`). 
> 
> This patch eliminates the sequential scan and mapping to job key by simply 
> returning an immutable copy of the key set of the existing secondary index 
> `job`.
> 
> 
> Diffs
> -
> 
>   src/jmh/java/org/apache/aurora/benchmark/TaskStoreBenchmarks.java 
> f2f00b92bf901c438e40bbc177e9f5a112be1bbc 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemTaskStore.java 
> fc272ddb45be8b2f55f01c3d54f7fa9058202c0b 
> 
> Diff: https://reviews.apache.org/r/55217/diff/
> 
> 
> Testing
> ---
> 
> Using the following modified version of existing 
> `TaskStoreBenchmarks.MemFetchTasksBenchmark` which benchmarks 
> `TaskStore.getJobKeys()`:
> ```java
> public static class MemFetchTasksBenchmark extends 
> AbstractFetchTasksBenchmark {
> @Setup(Level.Trial)
> @Override
> public void setUp() {
>   storage = Guice.createInjector(
>   Modules.combine(
>   DbModule.testModuleWithWorkQueue(PLAIN, Optional.of(new 
> InMemStoresModule(PLAIN))),
>   new AbstractModule() {
> @Override
> protected void configure() {
>   bind(StatsProvider.class).toInstance(new 
> FakeStatsProvider());
>   bind(Clock.class).toInstance(new FakeClock());
> }
>   }))
>   .getInstance(Storage.class);
> 
> }
> 
> @Setup(Level.Iteration)
> public void setUpIteration() {
>   createTasks(numTasks);
> }
> 
> @TearDown(Level.Iteration)
> public void tearDownIteration() {
>   deleteTasks();
> }
> 
> @Benchmark
> public Iterable run() {
>   return storage.read(store -> store.getTaskStore().getJobKeys());
> }
>   }
> ```
> 
> Benchmark results BEFORE patch:
> ```
> Benchmark   (numTasks)   Mode  Cnt
> Score Error  Units
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run   1  thrpt5  
> 572.761 ± 168.865  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run   5  thrpt5   
> 80.697 ±   8.516  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run  10  thrpt5   
> 25.102 ±   3.244  ops/s
> ```
> 
> Benchmark results AFTER patch:
> ```
> Benchmark   (numTasks)   Mode  Cnt   
> Score   Error  Units
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run   1  thrpt5  
> 327336.998 ± 10207.402  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run   5  thrpt5  
> 320506.958 ± 23430.527  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run  10  thrpt5  
> 287962.695 ± 51917.245  ops/s
> ```
> 
> 
> Thanks,
> 
> Mehrdad Nurolahzade
> 
>



Re: Review Request 55217: AURORA-1847 Eliminate sequential scan in MemTaskStore.getJobKeys()

2017-01-05 Thread Mehrdad Nurolahzade


> On Jan. 5, 2017, 11:17 a.m., Reza Motamedi wrote:
> > src/jmh/java/org/apache/aurora/benchmark/TaskStoreBenchmarks.java, line 93
> > 
> >
> > newb question here... I thought that FakeClock does not advance by 
> > itself. I am wondering what it the benchamrks are coming from or, if it is 
> > just added to fix a dependency?

Correct, this is a fix for a missing dependency.

Not sure what requires an injection of `Clock` here, perhaps something related 
to stats. Anyways, JMH does not do timing based on Aurora's internal 
abstraction of `Clock`, therefore benchmarks are not affected by the choice of 
`Clock` abstraction.


- Mehrdad


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55217/#review160616
---


On Jan. 5, 2017, 10:59 a.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55217/
> ---
> 
> (Updated Jan. 5, 2017, 10:59 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Stephan Erb.
> 
> 
> Bugs: AURORA-1847
> https://issues.apache.org/jira/browse/AURORA-1847
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> If scheduler is configured to run with the `MemTaskStore` every hit on 
> scheduler landing page (`/scheduler`) causes a call to 
> `MemTaskStore.getJobKeys()` through `ReadOnlyScheduler.getRoleSummary()`.
> 
> The implementation of `MemTaskStore.getJobKeys()` is currently very 
> inefficient as it requires a sequential scan of the task store and mapping to 
> their respective job keys. In Twitter clusters this method is currently 
> taking half a second per call (`mem_storage_get_job_keys`). 
> 
> This patch eliminates the sequential scan and mapping to job key by simply 
> returning an immutable copy of the key set of the existing secondary index 
> `job`.
> 
> 
> Diffs
> -
> 
>   src/jmh/java/org/apache/aurora/benchmark/TaskStoreBenchmarks.java 
> f2f00b92bf901c438e40bbc177e9f5a112be1bbc 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemTaskStore.java 
> fc272ddb45be8b2f55f01c3d54f7fa9058202c0b 
> 
> Diff: https://reviews.apache.org/r/55217/diff/
> 
> 
> Testing
> ---
> 
> Using the following modified version of existing 
> `TaskStoreBenchmarks.MemFetchTasksBenchmark` which benchmarks 
> `TaskStore.getJobKeys()`:
> ```java
> public static class MemFetchTasksBenchmark extends 
> AbstractFetchTasksBenchmark {
> @Setup(Level.Trial)
> @Override
> public void setUp() {
>   storage = Guice.createInjector(
>   Modules.combine(
>   DbModule.testModuleWithWorkQueue(PLAIN, Optional.of(new 
> InMemStoresModule(PLAIN))),
>   new AbstractModule() {
> @Override
> protected void configure() {
>   bind(StatsProvider.class).toInstance(new 
> FakeStatsProvider());
>   bind(Clock.class).toInstance(new FakeClock());
> }
>   }))
>   .getInstance(Storage.class);
> 
> }
> 
> @Setup(Level.Iteration)
> public void setUpIteration() {
>   createTasks(numTasks);
> }
> 
> @TearDown(Level.Iteration)
> public void tearDownIteration() {
>   deleteTasks();
> }
> 
> @Benchmark
> public Iterable run() {
>   return storage.read(store -> store.getTaskStore().getJobKeys());
> }
>   }
> ```
> 
> Benchmark results BEFORE patch:
> ```
> Benchmark   (numTasks)   Mode  Cnt
> Score Error  Units
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run   1  thrpt5  
> 572.761 ± 168.865  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run   5  thrpt5   
> 80.697 ±   8.516  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run  10  thrpt5   
> 25.102 ±   3.244  ops/s
> ```
> 
> Benchmark results AFTER patch:
> ```
> Benchmark   (numTasks)   Mode  Cnt   
> Score   Error  Units
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run   1  thrpt5  
> 327336.998 ± 10207.402  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run   5  thrpt5  
> 320506.958 ± 23430.527  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run  10  thrpt5  
> 287962.695 ± 51917.245  ops/s
> ```
> 
> 
> Thanks,
> 
> Mehrdad Nurolahzade
> 
>



Re: Review Request 55217: AURORA-1847 Eliminate sequential scan in MemTaskStore.getJobKeys()

2017-01-05 Thread Reza Motamedi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55217/#review160616
---




src/jmh/java/org/apache/aurora/benchmark/TaskStoreBenchmarks.java (line 93)


newb question here... I thought that FakeClock does not advance by itself. 
I am wondering what it the benchamrks are coming from or, if it is just added 
to fix a dependency?


- Reza Motamedi


On Jan. 5, 2017, 6:59 p.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55217/
> ---
> 
> (Updated Jan. 5, 2017, 6:59 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Stephan Erb.
> 
> 
> Bugs: AURORA-1847
> https://issues.apache.org/jira/browse/AURORA-1847
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> If scheduler is configured to run with the `MemTaskStore` every hit on 
> scheduler landing page (`/scheduler`) causes a call to 
> `MemTaskStore.getJobKeys()` through `ReadOnlyScheduler.getRoleSummary()`.
> 
> The implementation of `MemTaskStore.getJobKeys()` is currently very 
> inefficient as it requires a sequential scan of the task store and mapping to 
> their respective job keys. In Twitter clusters this method is currently 
> taking half a second per call (`mem_storage_get_job_keys`). 
> 
> This patch eliminates the sequential scan and mapping to job key by simply 
> returning an immutable copy of the key set of the existing secondary index 
> `job`.
> 
> 
> Diffs
> -
> 
>   src/jmh/java/org/apache/aurora/benchmark/TaskStoreBenchmarks.java 
> f2f00b92bf901c438e40bbc177e9f5a112be1bbc 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemTaskStore.java 
> fc272ddb45be8b2f55f01c3d54f7fa9058202c0b 
> 
> Diff: https://reviews.apache.org/r/55217/diff/
> 
> 
> Testing
> ---
> 
> Using the following modified version of existing 
> `TaskStoreBenchmarks.MemFetchTasksBenchmark` which benchmarks 
> `TaskStore.getJobKeys()`:
> ```java
> public static class MemFetchTasksBenchmark extends 
> AbstractFetchTasksBenchmark {
> @Setup(Level.Trial)
> @Override
> public void setUp() {
>   storage = Guice.createInjector(
>   Modules.combine(
>   DbModule.testModuleWithWorkQueue(PLAIN, Optional.of(new 
> InMemStoresModule(PLAIN))),
>   new AbstractModule() {
> @Override
> protected void configure() {
>   bind(StatsProvider.class).toInstance(new 
> FakeStatsProvider());
>   bind(Clock.class).toInstance(new FakeClock());
> }
>   }))
>   .getInstance(Storage.class);
> 
> }
> 
> @Setup(Level.Iteration)
> public void setUpIteration() {
>   createTasks(numTasks);
> }
> 
> @TearDown(Level.Iteration)
> public void tearDownIteration() {
>   deleteTasks();
> }
> 
> @Benchmark
> public Iterable run() {
>   return storage.read(store -> store.getTaskStore().getJobKeys());
> }
>   }
> ```
> 
> Benchmark results BEFORE patch:
> ```
> Benchmark   (numTasks)   Mode  Cnt
> Score Error  Units
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run   1  thrpt5  
> 572.761 ± 168.865  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run   5  thrpt5   
> 80.697 ±   8.516  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run  10  thrpt5   
> 25.102 ±   3.244  ops/s
> ```
> 
> Benchmark results AFTER patch:
> ```
> Benchmark   (numTasks)   Mode  Cnt   
> Score   Error  Units
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run   1  thrpt5  
> 327336.998 ± 10207.402  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run   5  thrpt5  
> 320506.958 ± 23430.527  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run  10  thrpt5  
> 287962.695 ± 51917.245  ops/s
> ```
> 
> 
> Thanks,
> 
> Mehrdad Nurolahzade
> 
>



Re: Review Request 55217: AURORA-1847 Eliminate sequential scan in MemTaskStore.getJobKeys()

2017-01-05 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55217/#review160617
---



Master (d4ebb56) is green with this patch.
  ./build-support/jenkins/build.sh

However, it appears that it might lack test coverage.

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Jan. 5, 2017, 6:59 p.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55217/
> ---
> 
> (Updated Jan. 5, 2017, 6:59 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Stephan Erb.
> 
> 
> Bugs: AURORA-1847
> https://issues.apache.org/jira/browse/AURORA-1847
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> If scheduler is configured to run with the `MemTaskStore` every hit on 
> scheduler landing page (`/scheduler`) causes a call to 
> `MemTaskStore.getJobKeys()` through `ReadOnlyScheduler.getRoleSummary()`.
> 
> The implementation of `MemTaskStore.getJobKeys()` is currently very 
> inefficient as it requires a sequential scan of the task store and mapping to 
> their respective job keys. In Twitter clusters this method is currently 
> taking half a second per call (`mem_storage_get_job_keys`). 
> 
> This patch eliminates the sequential scan and mapping to job key by simply 
> returning an immutable copy of the key set of the existing secondary index 
> `job`.
> 
> 
> Diffs
> -
> 
>   src/jmh/java/org/apache/aurora/benchmark/TaskStoreBenchmarks.java 
> f2f00b92bf901c438e40bbc177e9f5a112be1bbc 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemTaskStore.java 
> fc272ddb45be8b2f55f01c3d54f7fa9058202c0b 
> 
> Diff: https://reviews.apache.org/r/55217/diff/
> 
> 
> Testing
> ---
> 
> Using the following modified version of existing 
> `TaskStoreBenchmarks.MemFetchTasksBenchmark` which benchmarks 
> `TaskStore.getJobKeys()`:
> ```java
> public static class MemFetchTasksBenchmark extends 
> AbstractFetchTasksBenchmark {
> @Setup(Level.Trial)
> @Override
> public void setUp() {
>   storage = Guice.createInjector(
>   Modules.combine(
>   DbModule.testModuleWithWorkQueue(PLAIN, Optional.of(new 
> InMemStoresModule(PLAIN))),
>   new AbstractModule() {
> @Override
> protected void configure() {
>   bind(StatsProvider.class).toInstance(new 
> FakeStatsProvider());
>   bind(Clock.class).toInstance(new FakeClock());
> }
>   }))
>   .getInstance(Storage.class);
> 
> }
> 
> @Setup(Level.Iteration)
> public void setUpIteration() {
>   createTasks(numTasks);
> }
> 
> @TearDown(Level.Iteration)
> public void tearDownIteration() {
>   deleteTasks();
> }
> 
> @Benchmark
> public Iterable run() {
>   return storage.read(store -> store.getTaskStore().getJobKeys());
> }
>   }
> ```
> 
> Benchmark results BEFORE patch:
> ```
> Benchmark   (numTasks)   Mode  Cnt
> Score Error  Units
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run   1  thrpt5  
> 572.761 ± 168.865  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run   5  thrpt5   
> 80.697 ±   8.516  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run  10  thrpt5   
> 25.102 ±   3.244  ops/s
> ```
> 
> Benchmark results AFTER patch:
> ```
> Benchmark   (numTasks)   Mode  Cnt   
> Score   Error  Units
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run   1  thrpt5  
> 327336.998 ± 10207.402  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run   5  thrpt5  
> 320506.958 ± 23430.527  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run  10  thrpt5  
> 287962.695 ± 51917.245  ops/s
> ```
> 
> 
> Thanks,
> 
> Mehrdad Nurolahzade
> 
>



Review Request 55217: AURORA-1847 Eliminate sequential scan in MemTaskStore.getJobKeys()

2017-01-05 Thread Mehrdad Nurolahzade

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55217/
---

Review request for Aurora, Joshua Cohen and Stephan Erb.


Bugs: AURORA-1847
https://issues.apache.org/jira/browse/AURORA-1847


Repository: aurora


Description
---

If scheduler is configured to run with the `MemTaskStore` every hit on 
scheduler landing page (`/scheduler`) causes a call to 
`MemTaskStore.getJobKeys()` through `ReadOnlyScheduler.getRoleSummary()`.

The implementation of `MemTaskStore.getJobKeys()` is currently very inefficient 
as it requires a sequential scan of the task store and mapping to their 
respective job keys. In Twitter clusters this method is currently taking half a 
second per call (`mem_storage_get_job_keys`). 

This patch eliminates the sequential scan and mapping to job key by simply 
returning an immutable copy of the key set of the existing secondary index 
`job`.


Diffs
-

  src/jmh/java/org/apache/aurora/benchmark/TaskStoreBenchmarks.java 
f2f00b92bf901c438e40bbc177e9f5a112be1bbc 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemTaskStore.java 
fc272ddb45be8b2f55f01c3d54f7fa9058202c0b 

Diff: https://reviews.apache.org/r/55217/diff/


Testing
---

Using the following modified version of existing 
`TaskStoreBenchmarks.MemFetchTasksBenchmark` which benchmarks 
`TaskStore.getJobKeys()`:
```java
public static class MemFetchTasksBenchmark extends AbstractFetchTasksBenchmark {
@Setup(Level.Trial)
@Override
public void setUp() {
  storage = Guice.createInjector(
  Modules.combine(
  DbModule.testModuleWithWorkQueue(PLAIN, Optional.of(new 
InMemStoresModule(PLAIN))),
  new AbstractModule() {
@Override
protected void configure() {
  bind(StatsProvider.class).toInstance(new FakeStatsProvider());
  bind(Clock.class).toInstance(new FakeClock());
}
  }))
  .getInstance(Storage.class);

}

@Setup(Level.Iteration)
public void setUpIteration() {
  createTasks(numTasks);
}

@TearDown(Level.Iteration)
public void tearDownIteration() {
  deleteTasks();
}

@Benchmark
public Iterable run() {
  return storage.read(store -> store.getTaskStore().getJobKeys());
}
  }
```

Benchmark results BEFORE patch:
```
Benchmark   (numTasks)   Mode  CntScore 
Error  Units
TaskStoreBenchmarks.MemFetchTasksBenchmark.run   1  thrpt5  572.761 
± 168.865  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run   5  thrpt5   80.697 
±   8.516  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run  10  thrpt5   25.102 
±   3.244  ops/s
```

Benchmark results AFTER patch:
```
Benchmark   (numTasks)   Mode  Cnt   
Score   Error  Units
TaskStoreBenchmarks.MemFetchTasksBenchmark.run   1  thrpt5  
327336.998 ± 10207.402  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run   5  thrpt5  
320506.958 ± 23430.527  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run  10  thrpt5  
287962.695 ± 51917.245  ops/s
```


Thanks,

Mehrdad Nurolahzade