[ https://issues.apache.org/jira/browse/AURORA-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stephan Erb updated AURORA-1802:
--------------------------------
    Fix Version/s: 0.17.0

> AttributeAggregate slows down scheduling of jobs with many instances
> --------------------------------------------------------------------
>
>                 Key: AURORA-1802
>                 URL: https://issues.apache.org/jira/browse/AURORA-1802
>             Project: Aurora
>          Issue Type: Bug
>          Components: Scheduler
>            Reporter: Stephan Erb
>             Fix For: 0.17.0
>
>
> The current implementation of
> [{{AttributeAggregate}}|https://github.com/apache/aurora/blob/f559e930659e25b3d7cacb7b845ebda50d18d66a/src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java]
> slows down scheduling of jobs with many instances. Interestingly, this is
> currently not visible in our job scheduling benchmark results as it only
> affects the benchmark setup time but not the measured part.
>
> {{AttributeAggregate}} relies on {{Suppliers.memoize}} to ensure that it is
> only computed once and only when necessary. This has probably been done
> because the factory
> [{{AttributeAggregate.getJobActiveState}}|https://github.com/apache/aurora/blob/f559e930659e25b3d7cacb7b845ebda50d18d66a/src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java#L56-L91]
> is slow.
>
> After some recent changes to schedule multiple task instances per scheduling
> round the aggregate is computed in each scheduling round via the call
> [{{resourceRequest.getJobState().updateAttributeAggregate(...)}}|https://github.com/apache/aurora/blob/f559e930659e25b3d7cacb7b845ebda50d18d66a/src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java#L173]
> in {{TaskAssigner}}. This means the expensive factory is called once per
> scheduling round.
>
> h3. Potential improvements
> * the current factory implementation performs one {{fetchTasks}} query
> followed by {{n}} distinct {{getHostAttributes}} queries. This could be
> reduced to a single SQL query.
> * the aggregate makes heavy use of {{ImmutableMultiset}} even though it is
> not immutable any more. There is potential room for improvement here.
> * The aggregate uses suppliers to perform a lazy instantiation even though
> its current usage is not lazy any more. We can either make the implementation
> eager, or ensure that the expensive part is only run when absolutely
> necessary.
>
> h3. Proof of concept
> * 4 mins 23.407 secs -- total runtime of
> {{./gradlew jmh -Pbenchmarks='SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark'}}
> * 2 mins 40.308 secs -- total runtime of
> {{./gradlew jmh -Pbenchmarks='SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark'}}
> with
> [{{resourceRequest.getJobState().updateAttributeAggregate(...)}}|https://github.com/apache/aurora/blob/f559e930659e25b3d7cacb7b845ebda50d18d66a/src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java#L173]
> commented out. This works as the call is not necessary when only a single
> instance is scheduled per scheduling round, as done in the benchmarks.
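
The point about lazy instantiation no longer being lazy can be illustrated with a minimal, self-contained sketch. It is not Aurora code: `LazyAggregateSketch`, `expensiveFactory`, and the hand-written `memoize` helper are hypothetical stand-ins (the real class uses Guava's `Suppliers.memoize`), and the sketch assumes the per-round recomputation behaves like building and immediately reading a fresh memoized aggregate in every scheduling round.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for the lazy aggregate; not Aurora's actual class. */
class LazyAggregateSketch {
  /** Counts how often the expensive factory ran. */
  static int factoryCalls = 0;

  /** Minimal memoizing supplier, similar in spirit to Guava's Suppliers.memoize. */
  static <T> Supplier<T> memoize(Supplier<T> delegate) {
    return new Supplier<T>() {
      private boolean computed = false;
      private T value;

      @Override
      public synchronized T get() {
        if (!computed) {
          value = delegate.get();
          computed = true;
        }
        return value;
      }
    };
  }

  /** Stand-in for the slow factory (task query plus host attribute lookups). */
  static String expensiveFactory() {
    factoryCalls++;
    return "aggregate";
  }

  /** One memoized aggregate shared by all rounds: the factory runs once. */
  static int reuseAcrossRounds(int rounds) {
    factoryCalls = 0;
    Supplier<String> aggregate = memoize(LazyAggregateSketch::expensiveFactory);
    for (int i = 0; i < rounds; i++) {
      aggregate.get();
    }
    return factoryCalls;
  }

  /** A fresh aggregate that is read in every round: one factory call per round. */
  static int rebuildEachRound(int rounds) {
    factoryCalls = 0;
    for (int i = 0; i < rounds; i++) {
      Supplier<String> aggregate = memoize(LazyAggregateSketch::expensiveFactory);
      aggregate.get();
    }
    return factoryCalls;
  }

  public static void main(String[] args) {
    System.out.println("reused: " + reuseAcrossRounds(5));
    System.out.println("rebuilt: " + rebuildEachRound(5));
  }
}
```

In this sketch, memoization only pays off while the same supplier instance survives across rounds. Once the aggregate is rebuilt and read each round, the supplier adds indirection without saving any work, which is why making the implementation eager or guarding the expensive part are both plausible options.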