[jira] [Assigned] (AURORA-1827) Fix SLA percentile calculation
     [ https://issues.apache.org/jira/browse/AURORA-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reza Motamedi reassigned AURORA-1827:
-------------------------------------

    Assignee: Reza Motamedi

> Fix SLA percentile calculation
> ------------------------------
>
>                 Key: AURORA-1827
>                 URL: https://issues.apache.org/jira/browse/AURORA-1827
>             Project: Aurora
>          Issue Type: Story
>            Reporter: Reza Motamedi
>            Assignee: Reza Motamedi
>            Priority: Trivial
>              Labels: newbie, sla
>
> The calculation of mttX (median-time-to-X) depends on the computation of
> percentile values. The current implementation does not behave well with a
> small sample size. For instance, for the sample set {50, 150}, the
> 50th percentile is reported as 50, although 100 seems a more appropriate
> return value.
> One solution is to modify `SlaUtil` to perform an extrapolation when the
> sample size is small or when the index corresponding to a percentile value is
> not an integer.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
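A minimal sketch of the idea raised in AURORA-1827, assuming linear interpolation between the two closest ranks (what the issue calls extrapolation). `InterpolatedPercentile` and `percentile` are hypothetical names used only for illustration; this is not the `SlaUtil` code, and it does not try to match whatever percentile conventions the scheduler already uses.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; this is not Aurora's SlaUtil. */
final class InterpolatedPercentile {
  private InterpolatedPercentile() {}

  /**
   * Returns the p-th percentile (0..100) of the samples, interpolating
   * linearly between the two closest ranks of the sorted data.
   */
  static double percentile(double[] samples, double p) {
    if (samples.length == 0) {
      throw new IllegalArgumentException("samples must not be empty");
    }
    if (p < 0 || p > 100) {
      throw new IllegalArgumentException("p must be within [0, 100]");
    }
    double[] sorted = samples.clone();
    Arrays.sort(sorted);
    // Fractional index into the sorted array: 0 for p = 0, n - 1 for p = 100.
    double rank = (p / 100.0) * (sorted.length - 1);
    int lower = (int) Math.floor(rank);
    int upper = (int) Math.ceil(rank);
    double fraction = rank - lower;
    // When rank is an integer, lower == upper and the sample is returned as is.
    return sorted[lower] + fraction * (sorted[upper] - sorted[lower]);
  }

  public static void main(String[] args) {
    System.out.println(percentile(new double[] {50, 150}, 50)); // prints 100.0
  }
}
```

For the sample set {50, 150} from the issue, the fractional rank is 0.5, so the result lands halfway between the two samples at 100.0.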
[jira] [Assigned] (AURORA-118) Add percentiles to @Timed, or write a new decorator to add percentiles
     [ https://issues.apache.org/jira/browse/AURORA-118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reza Motamedi reassigned AURORA-118:
------------------------------------

    Assignee: Reza Motamedi

> Add percentiles to @Timed, or write a new decorator to add percentiles
> ----------------------------------------------------------------------
>
>                 Key: AURORA-118
>                 URL: https://issues.apache.org/jira/browse/AURORA-118
>             Project: Aurora
>          Issue Type: Task
>          Components: Scheduler
>            Reporter: Bill Farner
>            Assignee: Reza Motamedi
>            Priority: Minor
>              Labels: newbie
>
> The @Timed annotation is really nice for 'sprinkling on' instrumentation, but
> doesn't expose percentiles. We've seen several areas where a long tail of
> slow operations caused major performance issues, so spotting these with
> percentiles would be very helpful.
[jira] [Commented] (AURORA-1825) Enable async logging by default
     [ https://issues.apache.org/jira/browse/AURORA-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15691758#comment-15691758 ]

Zameer Manji commented on AURORA-1825:
--------------------------------------

Locally I removed the expensive parts of our logback config with:
{noformat}
diff --git c/src/main/resources/logback.xml w/src/main/resources/logback.xml
index 84c175c..6206806 100644
--- c/src/main/resources/logback.xml
+++ w/src/main/resources/logback.xml
@@ -23,7 +23,7 @@ limitations under the License.
 System.err
-%.-1level%date{MMdd HH:mm:ss.SSS} [%thread, %class{0}:%line] %message %xThrowable%n
+%.-1level%date{MMdd HH:mm:ss.SSS} [%thread] %message %xThrowable%n
{noformat}

Before:
{noformat}
Benchmark                                               (numPendingTasks)  (numTasksToDelete)   Mode  Cnt  Score   Error  Units
StateManagerBenchmarks.DeleteTasksBenchmark.run                       N/A                1000  thrpt   10  2.510 ± 0.557  ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run                       N/A                   1  thrpt   10  0.272 ± 0.030  ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run                       N/A                   5  thrpt   10  0.053 ± 0.011  ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run               1000                 N/A  thrpt   10  2.446 ± 0.698  ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run                  1                 N/A  thrpt   10  0.246 ± 0.018  ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run                  5                 N/A  thrpt   10  0.041 ± 0.006  ops/s
{noformat}

After:
{noformat}
Benchmark                                               (numPendingTasks)  (numTasksToDelete)   Mode  Cnt  Score   Error  Units
StateManagerBenchmarks.DeleteTasksBenchmark.run                       N/A                1000  thrpt   10  8.640 ± 1.431  ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run                       N/A                   1  thrpt   10  0.892 ± 0.066  ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run                       N/A                   5  thrpt   10  0.172 ± 0.010  ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run               1000                 N/A  thrpt   10  4.837 ± 1.511  ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run                  1                 N/A  thrpt   10  0.510 ± 0.315  ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run                  5                 N/A  thrpt   10  0.079 ± 0.052  ops/s
{noformat}

I picked this benchmark because it logs a lot in the critical path.

We could probably fix this problem by removing the line number and replacing the class name with the logger name. The net result would be no line numbers but much faster logging.

> Enable async logging by default
> -------------------------------
>
>                 Key: AURORA-1825
>                 URL: https://issues.apache.org/jira/browse/AURORA-1825
>             Project: Aurora
>          Issue Type: Task
>            Reporter: Zameer Manji
>            Assignee: Jing Chen
>            Priority: Minor
>
> Based on my experience while working on AURORA-1823 and [~StephanErb]'s work
> on logging recently, I think it would be best if we enabled async logging.
> For example, if one attempts to parallelize the work inside
> {{StateManagerImpl}} there isn't much benefit because all of the state
> transitions are logged and all of the threads would contend for the lock.
[jira] [Commented] (AURORA-1825) Enable async logging by default
     [ https://issues.apache.org/jira/browse/AURORA-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15691616#comment-15691616 ]

Mehrdad Nurolahzade commented on AURORA-1825:
---------------------------------------------

Just as a side note: it would be interesting to see benchmarks on logging with expensive reflection-based patterns like class name or line number removed.

I did not find anything on this for Logback, but the Log4j documentation, for example, explicitly warns against using such patterns in performance-critical systems:

[https://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/PatternLayout.html]

{quote}WARNING Generating the caller class information is slow. Thus, use should be avoided unless execution speed is not an issue.{quote}

{quote}WARNING Generating caller location information is extremely slow and should be avoided unless execution speed is not an issue.{quote}

> Enable async logging by default
> -------------------------------
>
>                 Key: AURORA-1825
>                 URL: https://issues.apache.org/jira/browse/AURORA-1825
>             Project: Aurora
>          Issue Type: Task
>            Reporter: Zameer Manji
>            Assignee: Jing Chen
>            Priority: Minor
>
> Based on my experience while working on AURORA-1823 and [~StephanErb]'s work
> on logging recently, I think it would be best if we enabled async logging.
> For example, if one attempts to parallelize the work inside
> {{StateManagerImpl}} there isn't much benefit because all of the state
> transitions are logged and all of the threads would contend for the lock.
[jira] [Commented] (AURORA-1825) Enable async logging by default
     [ https://issues.apache.org/jira/browse/AURORA-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15691562#comment-15691562 ]

Stephan Erb commented on AURORA-1825:
-------------------------------------

I would love to see a benchmark showing that async logging is worthwhile before we go down that route. Asynchronous logging can be a real pain when debugging crashes and spurious bugs.

> Enable async logging by default
> -------------------------------
>
>                 Key: AURORA-1825
>                 URL: https://issues.apache.org/jira/browse/AURORA-1825
>             Project: Aurora
>          Issue Type: Task
>            Reporter: Zameer Manji
>            Assignee: Jing Chen
>            Priority: Minor
>
> Based on my experience while working on AURORA-1823 and [~StephanErb]'s work
> on logging recently, I think it would be best if we enabled async logging.
> For example if one attempts to parallelize the work inside
> {{StateManagerImpl}} there isn't much benefit because all of the state
> transitions are logged and all of the threads would contend for the lock.
[jira] [Assigned] (AURORA-1825) Enable async logging by default
     [ https://issues.apache.org/jira/browse/AURORA-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Chen reassigned AURORA-1825:
---------------------------------

    Assignee: Jing Chen

> Enable async logging by default
> -------------------------------
>
>                 Key: AURORA-1825
>                 URL: https://issues.apache.org/jira/browse/AURORA-1825
>             Project: Aurora
>          Issue Type: Task
>            Reporter: Zameer Manji
>            Assignee: Jing Chen
>            Priority: Minor
>
> Based on my experience while working on AURORA-1823 and [~StephanErb]'s work
> on logging recently, I think it would be best if we enabled async logging.
> For example if one attempts to parallelize the work inside
> {{StateManagerImpl}} there isn't much benefit because all of the state
> transitions are logged and all of the threads would contend for the lock.
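To make the contention argument in AURORA-1825 concrete, here is a rough sketch of the general async-logging pattern: worker threads hand lines to a bounded queue, and a single background thread performs the slow write. `AsyncLineLogger` is a hypothetical class written for this illustration, and the `List<String>` sink merely stands in for a slow, lock-protected appender. It is not logback's async appender, and the queue size of 1024 is an arbitrary assumption.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical illustration of async logging; not logback's implementation. */
final class AsyncLineLogger implements AutoCloseable {
  // Unique sentinel instance, compared by identity, that tells the writer to stop.
  private static final String STOP = new String("stop");

  private final BlockingQueue<String> queue = new LinkedBlockingQueue<>(1024);
  private final List<String> sink; // stands in for a slow appender
  private final Thread writer;

  AsyncLineLogger(List<String> sink) {
    this.sink = sink;
    this.writer = new Thread(this::drain, "async-log-writer");
    this.writer.setDaemon(true);
    this.writer.start();
  }

  /** Non-blocking for callers: returns false (dropping the line) if the queue is full. */
  boolean log(String line) {
    return queue.offer(line);
  }

  /** Runs on the writer thread only, so the sink needs no extra locking here. */
  private void drain() {
    try {
      while (true) {
        String line = queue.take();
        if (line == STOP) {
          return;
        }
        sink.add(line);
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  /** Flushes everything queued so far, then stops the writer thread. */
  @Override
  public void close() {
    try {
      queue.put(STOP);
      writer.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

The trade-off raised in the comments on this issue is visible in the sketch: lines still sitting in the queue are lost if the process dies before the writer thread drains them, which is what makes crash debugging harder.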
[jira] [Created] (AURORA-1829) Expose stats on preemptor BiCache expirations
Mehrdad Nurolahzade created AURORA-1829:
-------------------------------------------

             Summary: Expose stats on preemptor BiCache expirations
                 Key: AURORA-1829
                 URL: https://issues.apache.org/jira/browse/AURORA-1829
             Project: Aurora
          Issue Type: Story
          Components: Scheduler
            Reporter: Mehrdad Nurolahzade
            Priority: Minor

We are currently collecting stats for the size of preemptor {{BiCache}} ({{reservation_cache_size}}). We could additionally collect cache expiration stats to monitor overall preemption effectiveness. Currently, we have no visibility into whether reservations made by preemption are actually consumed or simply expire.
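The distinction AURORA-1829 asks for, consumed versus expired reservations, can be sketched with two counters on a TTL cache. Everything below is an assumption made for illustration: `ExpiringReservations` is a hypothetical class, not Aurora's `BiCache`, and the clock is passed in explicitly so the behaviour is deterministic.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical stand-in for a reservation cache; this is not Aurora's BiCache. */
final class ExpiringReservations<K, V> {
  private static final class Entry<V> {
    final V value;
    final long expiresAtMs;

    Entry(V value, long expiresAtMs) {
      this.value = value;
      this.expiresAtMs = expiresAtMs;
    }
  }

  private final Map<K, Entry<V>> entries = new HashMap<>();
  private final long ttlMs;
  private long expired;  // reservations that timed out without being used
  private long consumed; // reservations that were claimed before expiring

  ExpiringReservations(long ttlMs) {
    this.ttlMs = ttlMs;
  }

  synchronized void put(K key, V value, long nowMs) {
    entries.put(key, new Entry<>(value, nowMs + ttlMs));
  }

  /** Claims a reservation; returns null if it is absent or already expired. */
  synchronized V consume(K key, long nowMs) {
    evictExpired(nowMs);
    Entry<V> entry = entries.remove(key);
    if (entry == null) {
      return null;
    }
    consumed++;
    return entry.value;
  }

  /** Drops entries whose TTL has elapsed and counts each one as an expiration. */
  synchronized void evictExpired(long nowMs) {
    Iterator<Entry<V>> it = entries.values().iterator();
    while (it.hasNext()) {
      if (it.next().expiresAtMs <= nowMs) {
        it.remove();
        expired++;
      }
    }
  }

  synchronized long expiredCount() {
    return expired;
  }

  synchronized long consumedCount() {
    return consumed;
  }

  synchronized int size() {
    return entries.size();
  }
}
```

Exported next to the existing size stat, the ratio of the two counters would indicate how often preemption reservations go unused.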
[jira] [Commented] (AURORA-1827) Fix SLA percentile calculation
     [ https://issues.apache.org/jira/browse/AURORA-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15691040#comment-15691040 ]

Zameer Manji commented on AURORA-1827:
--------------------------------------

I upgraded us to Guava 20. It has a [Quantiles|http://google.github.io/guava/releases/20.0/api/docs/com/google/common/math/Quantiles.html] class and a [Stats|http://google.github.io/guava/releases/20.0/api/docs/com/google/common/math/Stats.html] class that could be very helpful here.

> Fix SLA percentile calculation
> ------------------------------
>
>                 Key: AURORA-1827
>                 URL: https://issues.apache.org/jira/browse/AURORA-1827
>             Project: Aurora
>          Issue Type: Story
>            Reporter: Reza Motamedi
>            Priority: Trivial
>              Labels: newbie, sla
>
> The calculation of mttX (median-time-to-X) depends on the computation of
> percentile values. The current implementation does not behave nicely with a
> small sample size. For instance, for a given sample set of {50, 150},
> 50-percentile is reported to be 50. Although, 100 seems a more appropriate
> return value.
> One solution is to modify `SlaUtil` to perform an extrapolation when the
> sample size is small or when the corresponding index to a percentile value is
> not an integer.
[jira] [Created] (AURORA-1828) Expose stats on the number of offers evaluated before a task is assigned
Mehrdad Nurolahzade created AURORA-1828:
-------------------------------------------

             Summary: Expose stats on the number of offers evaluated before a task is assigned
                 Key: AURORA-1828
                 URL: https://issues.apache.org/jira/browse/AURORA-1828
             Project: Aurora
          Issue Type: Story
          Components: Scheduler
            Reporter: Mehrdad Nurolahzade
            Priority: Minor

Expose stats on the number of offers evaluated before a task is assigned by {{TaskAssigner}}.

Although the number of invocations of the {{SchedulingFilterImpl.filter()}} method exposes the number of offers examined per unit of time, it does not provide us with visibility into how many offers are examined before a task is assigned in {{TaskSchedulerImpl.maybeAssign()}}.
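One way to picture the stat requested in AURORA-1828 is to count how many offers are examined inside each assignment attempt and only record the count when a match is found. The sketch below is purely illustrative: `OfferScanStats` and `firstMatch` are hypothetical names, the offer type is left generic, and nothing here reflects how `TaskAssigner` is actually structured.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical per-assignment offer counter; not part of Aurora. */
final class OfferScanStats {
  private long assignments;        // successful assignments seen so far
  private long offersEvaluated;    // offers examined across those assignments
  private long maxOffersEvaluated; // worst case for a single assignment

  /**
   * Returns the first offer accepted by the predicate and records how many
   * offers had to be examined to find it. Failed scans are not recorded.
   */
  synchronized <O> Optional<O> firstMatch(List<O> offers, Predicate<O> fits) {
    int examined = 0;
    for (O offer : offers) {
      examined++;
      if (fits.test(offer)) {
        assignments++;
        offersEvaluated += examined;
        maxOffersEvaluated = Math.max(maxOffersEvaluated, examined);
        return Optional.of(offer);
      }
    }
    return Optional.empty();
  }

  synchronized double meanOffersPerAssignment() {
    return assignments == 0 ? 0.0 : (double) offersEvaluated / assignments;
  }

  synchronized long maxOffersPerAssignment() {
    return maxOffersEvaluated;
  }
}
```

A per-unit-of-time counter on the filter cannot distinguish one assignment that scanned 1,000 offers from 1,000 assignments that each matched the first offer; the mean and maximum above can.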
[jira] [Updated] (AURORA-1826) Expose Thrift server request workload stats
     [ https://issues.apache.org/jira/browse/AURORA-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mehrdad Nurolahzade updated AURORA-1826:
----------------------------------------
    Labels: newbie  (was: )

> Expose Thrift server request workload stats
> -------------------------------------------
>
>                 Key: AURORA-1826
>                 URL: https://issues.apache.org/jira/browse/AURORA-1826
>             Project: Aurora
>          Issue Type: Story
>          Components: Scheduler
>            Reporter: Mehrdad Nurolahzade
>            Priority: Minor
>              Labels: newbie
>
> Current Thrift server stats expose the number and timing of requests received
> by the server. However, they fail to reflect the size of the requests. This
> limits our ability to get an accurate view of the workload currently handled
> by the scheduler.
> For example, every call to {{restartShards()}} is recorded as one event
> despite the fact that a request might only restart one shard while another
> request might seek to restart 1K shards. The request workload can be factored
> in to better interpret timing information.
[jira] [Updated] (AURORA-1827) Fix SLA percentile calculation
     [ https://issues.apache.org/jira/browse/AURORA-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joshua Cohen updated AURORA-1827:
---------------------------------
    Labels: newbie sla  (was: sla)

> Fix SLA percentile calculation
> ------------------------------
>
>                 Key: AURORA-1827
>                 URL: https://issues.apache.org/jira/browse/AURORA-1827
>             Project: Aurora
>          Issue Type: Story
>            Reporter: Reza Motamedi
>            Priority: Trivial
>              Labels: newbie, sla
>
> The calculation of mttX (median-time-to-X) depends on the computation of
> percentile values. The current implementation does not behave nicely with a
> small sample size. For instance, for a given sample set of {50, 150},
> 50-percentile is reported to be 50. Although, 100 seems a more appropriate
> return value.
> One solution is to modify `SlaUtil` to perform an extrapolation when the
> sample size is small or when the corresponding index to a percentile value is
> not an integer.
[jira] [Created] (AURORA-1827) Fix SLA percentile calculation
Reza Motamedi created AURORA-1827:
-------------------------------------

             Summary: Fix SLA percentile calculation
                 Key: AURORA-1827
                 URL: https://issues.apache.org/jira/browse/AURORA-1827
             Project: Aurora
          Issue Type: Story
            Reporter: Reza Motamedi
            Priority: Trivial

The calculation of mttX (median-time-to-X) depends on the computation of percentile values. The current implementation does not behave well with a small sample size. For instance, for the sample set {50, 150}, the 50th percentile is reported as 50, although 100 seems a more appropriate return value.

One solution is to modify `SlaUtil` to perform an extrapolation when the sample size is small or when the index corresponding to a percentile value is not an integer.
[jira] [Created] (AURORA-1826) Expose Thrift server request workload stats
Mehrdad Nurolahzade created AURORA-1826:
-------------------------------------------

             Summary: Expose Thrift server request workload stats
                 Key: AURORA-1826
                 URL: https://issues.apache.org/jira/browse/AURORA-1826
             Project: Aurora
          Issue Type: Story
          Components: Scheduler
            Reporter: Mehrdad Nurolahzade
            Priority: Minor

Current Thrift server stats expose the number and timing of requests received by the server. However, they fail to reflect the size of the requests. This limits our ability to get an accurate view of the workload currently handled by the scheduler.

For example, every call to {{restartShards()}} is recorded as one event despite the fact that a request might only restart one shard while another request might seek to restart 1K shards. The request workload can be factored in to better interpret timing information.
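As a rough illustration of what AURORA-1826 describes, the sketch below keeps an item count next to the request count for each method, so that a call touching one shard and a call touching 1K shards no longer look the same. `RequestWorkloadStats` and `record` are hypothetical names, and how the item count would be extracted from a real Thrift request is deliberately left out.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical workload counters keyed by method name; not part of Aurora. */
final class RequestWorkloadStats {
  private final ConcurrentMap<String, LongAdder> requests = new ConcurrentHashMap<>();
  private final ConcurrentMap<String, LongAdder> items = new ConcurrentHashMap<>();

  /** Records one request together with the number of items (e.g. shards) it covers. */
  void record(String method, int itemCount) {
    requests.computeIfAbsent(method, m -> new LongAdder()).increment();
    items.computeIfAbsent(method, m -> new LongAdder()).add(itemCount);
  }

  long requestCount(String method) {
    LongAdder adder = requests.get(method);
    return adder == null ? 0 : adder.sum();
  }

  long itemCount(String method) {
    LongAdder adder = items.get(method);
    return adder == null ? 0 : adder.sum();
  }
}
```

With both numbers available, timing for a method can be read per item rather than per call, which is the interpretation the issue is after.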
[jira] [Commented] (AURORA-1780) Offers with unknown resources types to Aurora crash the scheduler
     [ https://issues.apache.org/jira/browse/AURORA-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15689949#comment-15689949 ]

Stephan Erb commented on AURORA-1780:
-------------------------------------

Fix is on master. Thanks!

{code}
commit 4797dfe33ba08183fa9596a46ac8be51a64e08bb
Author: Renan DelValle
Date:   Wed Nov 23 13:08:51 2016 +0100

    Filter out calls to fromResource for resources that Aurora does not support yet to avoid crashing

    Added filters whenever fromResource is called for a Protos.Resource in order to avoid Aurora crashing. Previously only bagFromMesosResources was using the SUPPORTED_RESOURCE filter.

    Reviewed at https://reviews.apache.org/r/53923/

 src/main/java/org/apache/aurora/scheduler/resources/ResourceManager.java | 23 +-
 1 file changed, 17 insertions(+), 6 deletions(-)
{code}

> Offers with unknown resources types to Aurora crash the scheduler
> -----------------------------------------------------------------
>
>                 Key: AURORA-1780
>                 URL: https://issues.apache.org/jira/browse/AURORA-1780
>             Project: Aurora
>          Issue Type: Bug
>         Environment: vagrant
>            Reporter: Renan DelValle
>            Assignee: Renan DelValle
>             Fix For: 0.17.0
>
>
> Taking offers from Agents which have resources that are not known to Aurora
> causes the Scheduler to crash.
> Steps to reproduce:
> {code}
> vagrant up
> sudo service mesos-slave stop
> echo "cpus(aurora-role):0.5;cpus(*):3.5;mem(aurora-role):1024;disk:2;gpus(*):4;test:200" | sudo tee /etc/mesos-slave/resources
> sudo rm -f /var/lib/mesos/meta/slaves/latest
> sudo service mesos-slave start
> {code}
> Wait around a few moments for the offer to be made to Aurora
> {code}
> I0922 02:41:57.839 [Thread-19, MesosSchedulerImpl:142] Received notification of lost agent: value: "cadaf569-171d-42fc-a417-fbd608ea5bab-S0"
> I0922 02:42:30.585597 2999 log.cpp:577] Attempting to append 109 bytes to the log
> I0922 02:42:30.585654 2999 coordinator.cpp:348] Coordinator attempting to write APPEND action at position 4
> I0922 02:42:30.585747 2999 replica.cpp:537] Replica received write request for position 4 from (10)@192.168.33.7:8083
> I0922 02:42:30.586858 2999 leveldb.cpp:341] Persisting action (125 bytes) to leveldb took 1.086601ms
> I0922 02:42:30.586897 2999 replica.cpp:712] Persisted action at 4
> I0922 02:42:30.587020 2999 replica.cpp:691] Replica received learned notice for position 4 from @0.0.0.0:0
> I0922 02:42:30.587785 2999 leveldb.cpp:341] Persisting action (127 bytes) to leveldb took 746999ns
> I0922 02:42:30.587805 2999 replica.cpp:712] Persisted action at 4
> I0922 02:42:30.587811 2999 replica.cpp:697] Replica learned APPEND action at position 4
> I0922 02:42:30.601 [SchedulerImpl-0, OfferManager$OfferManagerImpl:185] Returning offers for cadaf569-171d-42fc-a417-fbd608ea5bab-S1 for compaction.
> Sep 22, 2016 2:42:38 AM com.google.common.util.concurrent.ServiceManager$ServiceListener failed
> SEVERE: Service SlotSizeCounterService [FAILED] has failed in the RUNNING state.
> java.lang.NullPointerException: Unknown Mesos resource: name: "test"
> type: SCALAR
> scalar {
>   value: 200.0
> }
> role: "*"
>       at java.util.Objects.requireNonNull(Objects.java:228)
>       at org.apache.aurora.scheduler.resources.ResourceType.fromResource(ResourceType.java:355)
>       at org.apache.aurora.scheduler.resources.ResourceManager.lambda$static$0(ResourceManager.java:52)
>       at com.google.common.collect.Iterators$7.computeNext(Iterators.java:675)
>       at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>       at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>       at java.util.Iterator.forEachRemaining(Iterator.java:115)
>       at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
>       at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>       at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>       at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>       at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>       at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
>       at org.apache.aurora.scheduler.resources.ResourceManager.bagFromResources(ResourceManager.java:274)
>       at org.apache.aurora.scheduler.resources.ResourceManager.bagFromMesosResources(ResourceManager.java:239)
>       at org.apache.aurora.scheduler.stats.AsyncStatsModule$OfferAdapter.get(AsyncStatsModule.java:153)
>       at org.apache.aurora.scheduler.stats.SlotSizeCounter.run(SlotSizeCounter.java:168)
>       at org.apache.aurora.scheduler.stats.AsyncStatsModule$SlotSizeCounterService.ru
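The failure and the fix described for AURORA-1780 follow a common pattern: a lookup that rejects unknown names with `Objects.requireNonNull` is safe only if unknown names are filtered out first. The sketch below reproduces that pattern in isolation. It is an assumption-laden illustration, not the committed change: `SupportedResourceFilter`, `Kind`, and the three listed resource names are made up for the example, and resources are reduced to plain name strings instead of Mesos protobufs.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

/** Hypothetical illustration of filtering before a strict lookup; not Aurora code. */
final class SupportedResourceFilter {
  /** Illustrative subset of resource kinds; not Aurora's actual ResourceType enum. */
  enum Kind {
    CPUS("cpus"), MEM("mem"), DISK("disk");

    private final String mesosName;

    Kind(String mesosName) {
      this.mesosName = mesosName;
    }
  }

  private static final Map<String, Kind> BY_NAME = new HashMap<>();

  static {
    for (Kind kind : Kind.values()) {
      BY_NAME.put(kind.mesosName, kind);
    }
  }

  private SupportedResourceFilter() {}

  /** Strict lookup: throws NullPointerException for names it does not know. */
  static Kind fromName(String name) {
    return Objects.requireNonNull(BY_NAME.get(name), "Unknown Mesos resource: " + name);
  }

  static boolean isSupported(String name) {
    return BY_NAME.containsKey(name);
  }

  /** Applies the filter before the strict lookup, so unknown names are skipped. */
  static List<Kind> supportedKinds(List<String> offeredNames) {
    return offeredNames.stream()
        .filter(SupportedResourceFilter::isSupported)
        .map(SupportedResourceFilter::fromName)
        .collect(Collectors.toList());
  }
}
```

Given an offer carrying `cpus`, `test`, and `mem`, `supportedKinds` returns the two known kinds and silently ignores `test`, whereas calling `fromName("test")` directly throws the same kind of `NullPointerException` seen in the stack trace.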
[jira] [Resolved] (AURORA-1780) Offers with unknown resources types to Aurora crash the scheduler
     [ https://issues.apache.org/jira/browse/AURORA-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Erb resolved AURORA-1780.
---------------------------------
    Resolution: Fixed

> Offers with unknown resources types to Aurora crash the scheduler
> -----------------------------------------------------------------
>
>                 Key: AURORA-1780
>                 URL: https://issues.apache.org/jira/browse/AURORA-1780
>             Project: Aurora
>          Issue Type: Bug
>         Environment: vagrant
>            Reporter: Renan DelValle
>            Assignee: Renan DelValle
>             Fix For: 0.17.0
>
>
> Taking offers from Agents which have resources that are not known to Aurora
> cause the Scheduler to crash.
> Steps to reproduce:
> {code}
> vagrant up
> sudo service mesos-slave stop
> echo "cpus(aurora-role):0.5;cpus(*):3.5;mem(aurora-role):1024;disk:2;gpus(*):4;test:200" | sudo tee /etc/mesos-slave/resources
> sudo rm -f /var/lib/mesos/meta/slaves/latest
> sudo service mesos-slave start
> {code}
> Wait around a few moments for the offer to be made to Aurora
> {code}
> I0922 02:41:57.839 [Thread-19, MesosSchedulerImpl:142] Received notification of lost agent: value: "cadaf569-171d-42fc-a417-fbd608ea5bab-S0"
> I0922 02:42:30.585597 2999 log.cpp:577] Attempting to append 109 bytes to the log
> I0922 02:42:30.585654 2999 coordinator.cpp:348] Coordinator attempting to write APPEND action at position 4
> I0922 02:42:30.585747 2999 replica.cpp:537] Replica received write request for position 4 from (10)@192.168.33.7:8083
> I0922 02:42:30.586858 2999 leveldb.cpp:341] Persisting action (125 bytes) to leveldb took 1.086601ms
> I0922 02:42:30.586897 2999 replica.cpp:712] Persisted action at 4
> I0922 02:42:30.587020 2999 replica.cpp:691] Replica received learned notice for position 4 from @0.0.0.0:0
> I0922 02:42:30.587785 2999 leveldb.cpp:341] Persisting action (127 bytes) to leveldb took 746999ns
> I0922 02:42:30.587805 2999 replica.cpp:712] Persisted action at 4
> I0922 02:42:30.587811 2999 replica.cpp:697] Replica learned APPEND action at position 4
> I0922 02:42:30.601 [SchedulerImpl-0, OfferManager$OfferManagerImpl:185] Returning offers for cadaf569-171d-42fc-a417-fbd608ea5bab-S1 for compaction.
> Sep 22, 2016 2:42:38 AM com.google.common.util.concurrent.ServiceManager$ServiceListener failed
> SEVERE: Service SlotSizeCounterService [FAILED] has failed in the RUNNING state.
> java.lang.NullPointerException: Unknown Mesos resource: name: "test"
> type: SCALAR
> scalar {
>   value: 200.0
> }
> role: "*"
>       at java.util.Objects.requireNonNull(Objects.java:228)
>       at org.apache.aurora.scheduler.resources.ResourceType.fromResource(ResourceType.java:355)
>       at org.apache.aurora.scheduler.resources.ResourceManager.lambda$static$0(ResourceManager.java:52)
>       at com.google.common.collect.Iterators$7.computeNext(Iterators.java:675)
>       at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>       at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>       at java.util.Iterator.forEachRemaining(Iterator.java:115)
>       at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
>       at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>       at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>       at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>       at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>       at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
>       at org.apache.aurora.scheduler.resources.ResourceManager.bagFromResources(ResourceManager.java:274)
>       at org.apache.aurora.scheduler.resources.ResourceManager.bagFromMesosResources(ResourceManager.java:239)
>       at org.apache.aurora.scheduler.stats.AsyncStatsModule$OfferAdapter.get(AsyncStatsModule.java:153)
>       at org.apache.aurora.scheduler.stats.SlotSizeCounter.run(SlotSizeCounter.java:168)
>       at org.apache.aurora.scheduler.stats.AsyncStatsModule$SlotSizeCounterService.runOneIteration(AsyncStatsModule.java:130)
>       at com.google.common.util.concurrent.AbstractScheduledService$ServiceDelegate$Task.run(AbstractScheduledService.java:189)
>       at com.google.common.util.concurrent.Callables$3.run(Callables.java:100)
>       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>       at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>       at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>       at jav