> On Feb. 15, 2017, 9:40 a.m., David McLaughlin wrote: > > src/main/java/org/apache/aurora/scheduler/pruning/TaskHistoryPruner.java, > > lines 150-151 > > <https://reviews.apache.org/r/56575/diff/5/?file=1634553#file1634553line150> > > > > Can you explain the reasoning behind doing this in a loop per job > > rather than a single loop that queries all tasks? > > Mehrdad Nurolahzade wrote: > That's how I originally did it (see revision 1). The concern expressed by > reviewers (read above) was around the following areas: > 1. We are trying to load/filter all terminal state tasks at once which > might cause heap pressure. Looking at one job at a time might be slow and > inefficient from a throughput standpoint (which is a secondary concern) but > can help with putting less pressure on the heap. > 2. `MemStore` does not have a scheduling status index therefore it has to > sequentially scan all tasks to find matching tasks. > > David McLaughlin wrote: > Were those concerns backed by data? Do we have performance numbers from > before and after that change? It doesn't seem to me like a given that moving > from a single query to N+1 queries leads to less work and less objects being > created. Especially when the tasks are already on heap (DbTaskStore is still > considered experimental and not production ready. We should not optimize for > it). > > For point (2) - this does not prevent us scanning every single task in > the store. It just divides the full scan into multiple queries, and each of > those queries (and subsequent filtering) has the overhead of objects, > streams, sorted copies (yikes) and collections that are used for filtering > inside the task processing loop.
No, I did not verify performance/heap profile of the original version. I'm going to deploy it to a test cluster and see if there is a noticable difference in run-time behavior. I agree on both points regarding `MemTaskStore`. Now, I am seeing value in separating prunable task selection implementations for `MemTaskStore` and `DbTaskStore`. But, that can be addressed in a follow-up ticket. - Mehrdad ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/56575/#review165733 ----------------------------------------------------------- On Feb. 15, 2017, 9:07 a.m., Mehrdad Nurolahzade wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/56575/ > ----------------------------------------------------------- > > (Updated Feb. 15, 2017, 9:07 a.m.) > > > Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, and > Stephan Erb. > > > Bugs: AURORA-1837 > https://issues.apache.org/jira/browse/AURORA-1837 > > > Repository: aurora > > > Description > ------- > > This patch addressed efficiency issues in the current implementation of > `TaskHistoryPruner`. The new design is similar to that of > `JobUpdateHistoryPruner`: (a) Instead of registering a `DelayExecutor` run > upon terminal task state transitions, it runs on preconfigured intervals, > finds all terminal state tasks that meet pruning criteria and deletes them. > (b) Makes the initial task history pruning delay configurable so that it does > not hamper scheduler upon start. > > The new design addressed the following two efficiecy problems: > > 1. Upon scheduler restart/failure, the in-memory state of task history > pruning scheduled with `DelayExecutor` is lost. `TaskHistoryPruner` learns > about these dead tasks upon restart when log is replayed. These expired tasks > are picked up by the second call to `executor.execute()` that performs job > level pruning immediately (i.e., without delay). Hence, most task history > pruning happens after scheduler restarts and can severely hamper scheduler > performance (or cause consecutive fail-overs on test clusters when we put > load test on scheduler). > > 2. Expired tasks can be picked up for pruning multiple times. The > asynchronous nature of `BatchWorker` which used to process task deletions > introduces some delay between delete enqueue and delete execution. As a > result, tasks already queued for deletion in a previous evaluation round > might get picked up, evaluated and enqueued for deletion again. This is > evident in `tasks_pruned` metric which reflects numbers much higher than the > actual number of expired tasks deleted. > > > Diffs > ----- > > src/main/java/org/apache/aurora/scheduler/pruning/PruningModule.java > 735199ac1ccccab343c24471890aa330d6635c26 > src/main/java/org/apache/aurora/scheduler/pruning/TaskHistoryPruner.java > f77849498ff23616f1d56d133eb218f837ac3413 > > src/test/java/org/apache/aurora/scheduler/pruning/TaskHistoryPrunerTest.java > 14e4040e0b94e96f77068b41454311fa3bf53573 > > Diff: https://reviews.apache.org/r/56575/diff/ > > > Testing > ------- > > Manual testing under Vagrant > > > Thanks, > > Mehrdad Nurolahzade > >