-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51765/
-----------------------------------------------------------
(Updated Sept. 14, 2016, 11:17 p.m.)


Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Changes
-------

Rebased.


Repository: aurora


Description
-------

This is the final part of the `BatchWorker` conversion work; it converts `TaskScheduler`. See https://reviews.apache.org/r/51759 for more background on the `BatchWorker`.

#####Problem
See https://reviews.apache.org/r/51759

#####Remediation
Task scheduling is one of the dominant users of the write lock. It's also one of the heaviest and the most latency-sensitive. As such, the default max batch size is chosen conservatively low (3) and batch items are executed in a blocking way.

BTW, attempting to make task scheduling non-blocking resulted in much worse scheduling performance. The way our `DBTaskStore` is wired, all async activities, including `EventBus`, are bound to use a single async `Executor`, which is currently limited to 8 threads [1]. Relying on the same `EventBus` to deliver scheduling completion events resulted in slower scheduling performance, as those events were backed up behind all other activities, including task status events, reconciliation, etc. Increasing the executor thread pool size to a larger number, on the other hand, also increased the lock contention, defeating the whole purpose of this work.

#####Results
See https://reviews.apache.org/r/51759 for the lock contention results.
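
For illustration only, the blocking batch execution described under Remediation can be sketched roughly as below. This is not the code in the diff: the class `BatchSketch`, its methods `submit`, `execute` and `processOneBatch`, and the `ReentrantLock` standing in for the storage write lock are all hypothetical names assumed for the example. It only shows the general idea of draining up to a max batch size (3 here) of queued work items, running them under a single lock acquisition, and having callers block on the result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical sketch of batched, blocking execution; not Aurora's BatchWorker.
final class BatchSketch {
  // One queued unit of work plus the future its caller waits on.
  private static final class Item<T> {
    final Supplier<T> work;
    final CompletableFuture<T> result = new CompletableFuture<>();

    Item(Supplier<T> work) {
      this.work = work;
    }

    void run() {
      try {
        result.complete(work.get());
      } catch (RuntimeException e) {
        result.completeExceptionally(e);
      }
    }
  }

  private final BlockingQueue<Item<?>> queue = new LinkedBlockingQueue<>();
  // Assumed stand-in for the storage write lock.
  private final ReentrantLock writeLock = new ReentrantLock();
  private final int maxBatchSize;
  final AtomicInteger lockAcquisitions = new AtomicInteger();

  BatchSketch(int maxBatchSize) {
    this.maxBatchSize = maxBatchSize;
  }

  // Enqueues work and returns immediately.
  <T> CompletableFuture<T> submit(Supplier<T> work) {
    Item<T> item = new Item<>(work);
    queue.add(item);
    return item.result;
  }

  // Blocking variant: the caller waits until its batch has been processed.
  // Must not be called from the thread that runs processOneBatch().
  <T> T execute(Supplier<T> work) {
    return submit(work).join();
  }

  // Waits for one item, drains up to maxBatchSize - 1 more, and runs the
  // whole batch under a single lock acquisition. Returns the batch size.
  int processOneBatch() throws InterruptedException {
    List<Item<?>> batch = new ArrayList<>();
    batch.add(queue.take());
    queue.drainTo(batch, maxBatchSize - 1);
    writeLock.lock();
    try {
      lockAcquisitions.incrementAndGet();
      for (Item<?> item : batch) {
        item.run();
      }
    } finally {
      writeLock.unlock();
    }
    return batch.size();
  }

  public static void main(String[] args) throws Exception {
    BatchSketch worker = new BatchSketch(3);
    List<CompletableFuture<Integer>> futures = new ArrayList<>();
    for (int i = 0; i < 7; i++) {
      final int n = i;
      futures.add(worker.submit(() -> n * 2));
    }
    // Seven queued items with a max batch size of 3 give batches of 3, 3, 1.
    System.out.println(worker.processOneBatch());
    System.out.println(worker.processOneBatch());
    System.out.println(worker.processOneBatch());
    System.out.println("lock acquisitions: " + worker.lockAcquisitions.get());

    // With a dedicated worker thread, callers can simply block on execute().
    Thread loop = new Thread(() -> {
      try {
        while (true) {
          worker.processOneBatch();
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    loop.setDaemon(true);
    loop.start();
    System.out.println(worker.execute(() -> 42));
  }
}
```

In this sketch, a small batch size keeps each lock hold short, which is the trade-off the description alludes to for a latency-sensitive caller; the actual default and wiring are whatever the diff defines.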

[1] https://github.com/apache/aurora/blob/b24619b28c4dbb35188871bacd0091a9e01218e3/src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java#L51-L54


Diffs (updated)
-----

  src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 9d0d40b82653fb923bed16d06546288a1576c21d
  src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java 11e8033438ad0808e446e41bb26b3fa4c04136c7
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java c044ebe6f72183a67462bbd8e5be983eb592c3e9
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java d266f6a25ae2360db2977c43768a19b1f1efe8ff
  src/test/java/org/apache/aurora/scheduler/http/AbstractJettyTest.java c2ceb4e7685a9301f8014a9183e02fbad65bca26
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskGroupsTest.java 95cf25eda0a5bfc0cc4c46d1439ebe9d5359ce79
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java 72562e6bd9a9860c834e6a9faa094c28600a8fed

Diff: https://reviews.apache.org/r/51765/diff/


Testing
-------

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko