Re: Review Request 56575: AURORA-1837 Improve task history pruning

Stephan Erb Sun, 12 Mar 2017 14:38:22 -0700


> On Feb. 15, 2017, 6:40 p.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/pruning/TaskHistoryPruner.java
> > Lines 134-135 (original), 134-135 (patched)
> > <https://reviews.apache.org/r/56575/diff/5/?file=1634553#file1634553line150>
> >
> >     Can you explain the reasoning behind doing this in a loop per job 
> > rather than a single loop that queries all tasks?
> 
> Mehrdad Nurolahzade wrote:
>     That's how I originally did it (see revision 1). The concern expressed by 
> reviewers (read above) was around the following areas:
>     1. We are trying to load/filter all terminal state tasks at once which 
> might cause heap pressure. Looking at one job at a time might be slow and 
> inefficient from a throughput standpoint (which is a secondary concern) but 
> can help with putting less pressure on the heap.
>     2. `MemStore` does not have a scheduling status index therefore it has to 
> sequentially scan all tasks to find matching tasks.
> 
> David McLaughlin wrote:
>     Were those concerns backed by data? Do we have performance numbers from 
> before and after that change? It doesn't seem to me like a given that moving 
> from a single query to N+1 queries leads to less work and less objects being 
> created. Especially when the tasks are already on heap (DbTaskStore is still 
> considered experimental and not production ready. We should not optimize for 
> it). 
>     
>     For point (2) - this does not prevent us scanning every single task in 
> the store. It just divides the full scan into multiple queries, and each of 
> those queries (and subsequent filtering) has the overhead of objects, 
> streams, sorted copies (yikes) and collections that are used for filtering 
> inside the task processing loop.
> 
> Mehrdad Nurolahzade wrote:
>     No, I did not verify performance/heap profile of the original version. 
> I'm going to deploy it to a test cluster and see if there is a noticable 
> difference in run-time behavior.
>     
>     I agree on both points regarding `MemTaskStore`. Now, I am seeing value 
> in separating prunable task selection implementations for `MemTaskStore` and 
> `DbTaskStore`. But, that can be addressed in a follow-up ticket.


Were you able to run a version of this patch on your scale cluster? Any new 
insights?


- Stephan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56575/#review165733
-----------------------------------------------------------


On Feb. 15, 2017, 6:07 p.m., Mehrdad Nurolahzade wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56575/
> -----------------------------------------------------------
> 
> (Updated Feb. 15, 2017, 6:07 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, and 
> Stephan Erb.
> 
> 
> Bugs: AURORA-1837
>     https://issues.apache.org/jira/browse/AURORA-1837
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> This patch addressed efficiency issues in the current implementation of 
> `TaskHistoryPruner`. The new design is similar to that of 
> `JobUpdateHistoryPruner`: (a) Instead of registering a `DelayExecutor` run 
> upon terminal task state transitions, it runs on preconfigured intervals, 
> finds all terminal state tasks that meet pruning criteria and deletes them. 
> (b) Makes the initial task history pruning delay configurable so that it does 
> not hamper scheduler upon start.
> 
> The new design addressed the following two efficiecy problems:
> 
> 1. Upon scheduler restart/failure, the in-memory state of task history 
> pruning scheduled with `DelayExecutor` is lost. `TaskHistoryPruner` learns 
> about these dead tasks upon restart when log is replayed. These expired tasks 
> are picked up by the second call to `executor.execute()` that performs job 
> level pruning immediately (i.e., without delay). Hence, most task history 
> pruning happens after scheduler restarts and can severely hamper scheduler 
> performance (or cause consecutive fail-overs on test clusters when we put 
> load test on scheduler).
> 
> 2. Expired tasks can be picked up for pruning multiple times. The 
> asynchronous nature of `BatchWorker` which used to process task deletions 
> introduces some delay between delete enqueue and delete execution. As a 
> result, tasks already queued for deletion in a previous evaluation round 
> might get picked up, evaluated and enqueued for deletion again. This is 
> evident in `tasks_pruned` metric which reflects numbers much higher than the 
> actual number of expired tasks deleted.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/pruning/PruningModule.java 
> 735199ac1ccccab343c24471890aa330d6635c26 
>   src/main/java/org/apache/aurora/scheduler/pruning/TaskHistoryPruner.java 
> f77849498ff23616f1d56d133eb218f837ac3413 
>   
> src/test/java/org/apache/aurora/scheduler/pruning/TaskHistoryPrunerTest.java 
> 14e4040e0b94e96f77068b41454311fa3bf53573 
> 
> 
> Diff: https://reviews.apache.org/r/56575/diff/5/
> 
> 
> Testing
> -------
> 
> Manual testing under Vagrant
> 
> 
> Thanks,
> 
> Mehrdad Nurolahzade
> 
>

Re: Review Request 56575: AURORA-1837 Improve task history pruning

Reply via email to