[jira] [Commented] (AURORA-1589) Dupe debs are showing up
[ https://issues.apache.org/jira/browse/AURORA-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111772#comment-15111772 ] John Sirois commented on AURORA-1589: - I'd self-serve on this sort of thing if I had access to the jenkins job configs or if you turned the current job config for aurora-packaging into a single call to a script checked into the aurora-packaging repo for easier future self-serve, normal RB process. > Dupe debs are showing up > > > Key: AURORA-1589 > URL: https://issues.apache.org/jira/browse/AURORA-1589 > Project: Aurora > Issue Type: Story >Reporter: Dmitriy Shirchenko >Assignee: Bill Farner > > Looks like jessie and trusty packages are colliding and trusty (last one to > build) is winning out: > https://builds.apache.org/job/aurora-packaging-nightly/lastSuccessfulBuild/artifact/aurora-packaging/artifacts/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AURORA-1589) Dupe debs are showing up
[ https://issues.apache.org/jira/browse/AURORA-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111769#comment-15111769 ] John Sirois commented on AURORA-1589: - I think it would be strange. Better would be to ship the artifacts off to svn (https://dist.apache.org/repos/dist/dev/aurora) IMO as more officially served nightlies if we want to support nightlies at all. I'm unfamiliar with any intervening Apache protocols though on nightlies. > Dupe debs are showing up > > > Key: AURORA-1589 > URL: https://issues.apache.org/jira/browse/AURORA-1589 > Project: Aurora > Issue Type: Story >Reporter: Dmitriy Shirchenko >Assignee: Bill Farner > > Looks like jessie and trusty packages are colliding and trusty (last one to > build) is winning out: > https://builds.apache.org/job/aurora-packaging-nightly/lastSuccessfulBuild/artifact/aurora-packaging/artifacts/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (AURORA-1590) Make it possible to view task event dates in UTC in the scheduler UI
Joshua Cohen created AURORA-1590:

Summary: Make it possible to view task event dates in UTC in the scheduler UI
Key: AURORA-1590
URL: https://issues.apache.org/jira/browse/AURORA-1590
Project: Aurora
Issue Type: Task
Components: UI
Reporter: Joshua Cohen
Priority: Minor

On the update page, if you hover over a local date, we show you a tooltip that contains that date in UTC. It would be nice if we did this consistently. Most notably, it would be useful for task event dates on the job page.
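To make the request concrete, here is a minimal sketch of the conversion a tooltip like this needs: one instant rendered once in the viewer's zone and once in UTC. It is plain Java for illustration only, not the scheduler UI code; the class `EventTimes`, its method names, and the timestamp format are all assumptions.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Aurora.
final class EventTimes {
  private static final DateTimeFormatter FORMAT =
      DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

  private EventTimes() {}

  // Text for the table cell: the event time in the viewer's zone.
  static String local(long epochMillis, ZoneId viewerZone) {
    return FORMAT.format(Instant.ofEpochMilli(epochMillis).atZone(viewerZone));
  }

  // Text for the hover tooltip: the same instant rendered in UTC.
  static String utcTooltip(long epochMillis) {
    return FORMAT.format(Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC)) + " UTC";
  }

  public static void main(String[] args) {
    long eventMillis = 1453426705000L;
    // A viewer at UTC-8 sees the local time; the tooltip carries UTC.
    System.out.println(local(eventMillis, ZoneOffset.ofHours(-8))); // prints 2016-01-21 17:38:25
    System.out.println(utcTooltip(eventMillis));                    // prints 2016-01-22 01:38:25 UTC
  }
}
```

Keeping both strings derived from the same epoch value means the cell and the tooltip cannot drift apart, whichever page renders them.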
[jira] [Commented] (AURORA-1589) Dupe debs are showing up
[ https://issues.apache.org/jira/browse/AURORA-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111765#comment-15111765 ]

Bill Farner commented on AURORA-1589:

The clobbering is not happening (yet, anyway). What you see on that page is the result of the Jenkins configuration "Files to archive":

aurora-packaging/artifacts/aurora-ubuntu-trusty/dist/*.deb,aurora-packaging/artifacts/aurora-centos-7/dist/rpmbuild/RPMS/x86_64/*.rpm

They probably do step on each other, though, so we will need to see if Jenkins can include a path in the archive name. Alternatively, would it be strange to include the dist name in the deb file name?

> Dupe debs are showing up
>
> Key: AURORA-1589
> URL: https://issues.apache.org/jira/browse/AURORA-1589
> Project: Aurora
> Issue Type: Story
> Reporter: Dmitriy Shirchenko
> Assignee: Bill Farner
>
> Looks like jessie and trusty packages are colliding and trusty (last one to
> build) is winning out:
> https://builds.apache.org/job/aurora-packaging-nightly/lastSuccessfulBuild/artifact/aurora-packaging/artifacts/
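The collision risk discussed here can be sketched in a few lines: if archived files are keyed only by base name, two distributions that produce the same file name overwrite each other, while a name that carries the distribution directory stays unique. This is an illustration only, not Jenkins behaviour or the packaging scripts; the class `ArtifactNames`, the jessie directory name, and the `.deb` file name are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of aurora-packaging.
final class ArtifactNames {
  private ArtifactNames() {}

  // Keyed by base file name only: a later path replaces an earlier one with the same name.
  static Map<String, String> flatten(List<String> paths) {
    Map<String, String> byName = new LinkedHashMap<>();
    for (String path : paths) {
      byName.put(Paths.get(path).getFileName().toString(), path);
    }
    return byName;
  }

  // Prefixes the file name with the directory that follows "artifacts" (assumed to be the dist name).
  static String withDist(String path) {
    Path p = Paths.get(path);
    for (int i = 0; i < p.getNameCount() - 2; i++) {
      if (p.getName(i).toString().equals("artifacts")) {
        return p.getName(i + 1) + "-" + p.getFileName();
      }
    }
    // No "artifacts" directory found: fall back to the bare file name.
    return p.getFileName().toString();
  }

  public static void main(String[] args) {
    List<String> paths = Arrays.asList(
        "aurora-packaging/artifacts/aurora-debian-jessie/dist/aurora-scheduler_0.0.0_amd64.deb",
        "aurora-packaging/artifacts/aurora-ubuntu-trusty/dist/aurora-scheduler_0.0.0_amd64.deb");
    System.out.println(flatten(paths).size()); // one entry survives; the trusty path wins
    for (String path : paths) {
      System.out.println(withDist(path));      // distinct names per distribution
    }
  }
}
```

Whether the prefix belongs in the archive path or in the package file name itself is the open question in the comment; the sketch only shows that either choice removes the name clash.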
[jira] [Commented] (AURORA-1589) Dupe debs are showing up
[ https://issues.apache.org/jira/browse/AURORA-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111759#comment-15111759 ] Dmitriy Shirchenko commented on AURORA-1589: cc [~jsirois] > Dupe debs are showing up > > > Key: AURORA-1589 > URL: https://issues.apache.org/jira/browse/AURORA-1589 > Project: Aurora > Issue Type: Story >Reporter: Dmitriy Shirchenko >Assignee: Bill Farner > > Looks like jessie and trusty packages are colliding and trusty (last one to > build) is winning out: > https://builds.apache.org/job/aurora-packaging-nightly/lastSuccessfulBuild/artifact/aurora-packaging/artifacts/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (AURORA-1582) Task History Pruning attempts can fail silently
[ https://issues.apache.org/jira/browse/AURORA-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111758#comment-15111758 ]

Zameer Manji edited comment on AURORA-1582 at 1/22/16 1:39 AM:
---

{noformat}
commit c89fecbcd93aa6ddcf6af60f2a2cb6315b7a4d19
Author: Zameer Manji
Date: Thu Jan 21 17:38:25 2016 -0800

    Turn TaskHistoryPruner into a service and trigger shutdown on pruning failure.

    Task pruning is key to operating a large cluster and failure to prune should
    trigger shutdown to prevent unbounded growth of storage. This patch turns
    `TaskHistoryPruner` into a service which propagates failure from failed pruning
    attempts towards the `ServiceManager`.

    Also completing a TODO which removes a test for behaviour that is very awkward
    to test for.

    Bugs closed: AURORA-1582

    Reviewed at https://reviews.apache.org/r/42332/

 3 files changed, 68 insertions(+), 111 deletions(-)
{noformat}

was (Author: zmanji):
commit c89fecbcd93aa6ddcf6af60f2a2cb6315b7a4d19 Author: Zameer Manji Date: Thu Jan 21 17:38:25 2016 -0800 Turn TaskHistoryPruner into a service and trigger shutdown on pruning failure. Task pruning is key to operating a large cluster and failure to prune should trigger shutdown to prevent unbounded growth of storage. This patch turns `TaskHistoryPruner` into a service which propagates failure from failed pruning attempts towards the `ServiceManager`. Also completing a TODO which removes a test for behaviour that is very awkward to test for. Bugs closed: AURORA-1582 Reviewed at https://reviews.apache.org/r/42332/ 3 files changed, 68 insertions(+), 111 deletions(-)

> Task History Pruning attempts can fail silently
> ---
>
> Key: AURORA-1582
> URL: https://issues.apache.org/jira/browse/AURORA-1582
> Project: Aurora
> Issue Type: Bug
> Reporter: Zameer Manji
> Assignee: Zameer Manji
>
> As discovered in AURORA-1580, task history pruning attempts can fail, and
> when they do, they fail silently. The root cause seems to be that
> AsyncModule's {{AsyncProcessor}} threads just log the unhandled exception if
> it exists:
> {noformat}
>   private static void evaluateResult(Runnable runnable, Throwable throwable, Logger logger) {
>     // See java.util.concurrent.ThreadPoolExecutor#afterExecute(Runnable, Throwable)
>     // for more details and an implementation example.
>     if (throwable == null) {
>       if (runnable instanceof Future) {
>         try {
>           Future future = (Future) runnable;
>           if (future.isDone()) {
>             future.get();
>           }
>         } catch (InterruptedException ie) {
>           Thread.currentThread().interrupt();
>         } catch (ExecutionException ee) {
>           logger.error(ee.toString(), ee);
>         }
>       }
>     } else {
>       logger.error(throwable.toString(), throwable);
>     }
>   }
> {noformat}
> I think that instead of failing silently when work on these threads fails, we
> should shut down the scheduler, much as we do when the preemptor or another
> Guava service fails. This way the scheduler does not
> enter an undefined state and operators are informed of the abnormal behaviour.
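The suggestion in the issue description, escalating instead of only logging, can be sketched with the same `afterExecute` hook. Note that the commit quoted in this message takes a different route (it turns `TaskHistoryPruner` into a service); the code below is only an illustration of the general idea using the JDK alone. `FailFastExecutor`, its `onFailure` callback, and the `captureFailure` demo helper are hypothetical names, and what the callback does (for example, starting a scheduler shutdown) is left to the caller.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical executor that reports task failures to a callback instead of only logging them.
final class FailFastExecutor extends ScheduledThreadPoolExecutor {
  private final Consumer<Throwable> onFailure;

  FailFastExecutor(int threads, Consumer<Throwable> onFailure) {
    super(threads);
    this.onFailure = onFailure;
  }

  @Override
  protected void afterExecute(Runnable runnable, Throwable throwable) {
    super.afterExecute(runnable, throwable);
    Throwable failure = throwable;
    // Submitted tasks are wrapped in a Future, which swallows the exception until get() is called.
    if (failure == null && runnable instanceof Future) {
      Future<?> future = (Future<?>) runnable;
      if (future.isDone()) {
        try {
          future.get();
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
        } catch (ExecutionException ee) {
          failure = ee.getCause();
        } catch (CancellationException ce) {
          // A cancelled task is not treated as a failure in this sketch.
        }
      }
    }
    if (failure != null) {
      onFailure.accept(failure);
    }
  }

  // Demo helper: runs one task to completion and returns the failure reported, or null.
  static Throwable captureFailure(Runnable task) {
    AtomicReference<Throwable> seen = new AtomicReference<>();
    FailFastExecutor executor = new FailFastExecutor(1, seen::set);
    executor.submit(task);
    executor.shutdown();
    try {
      executor.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
    }
    return seen.get();
  }
}
```

For example, `FailFastExecutor.captureFailure(() -> { throw new IllegalStateException("prune failed"); })` returns the `IllegalStateException`, whereas a task that completes normally yields `null`.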
[jira] [Commented] (AURORA-1588) Unnecessary HealthCheckConfig endpoint warning
[ https://issues.apache.org/jira/browse/AURORA-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111756#comment-15111756 ]

Maxim Khutornenko commented on AURORA-1588:
---

BTW, this seems to be a general problem with our deprecation messaging. E.g. this warning also shows up all the time:
{noformat}
restart_threshold has been deprecated and will be removed in a future release
{noformat}
This is due to default values still being applied in the config. A possible solution could be dropping defaults where possible. E.g. it's safe to drop the default for {{restart_threshold}}, but not for the HealthCheckConfig values, as they are passed into the executor.

> Unnecessary HealthCheckConfig endpoint warning
> --
>
> Key: AURORA-1588
> URL: https://issues.apache.org/jira/browse/AURORA-1588
> Project: Aurora
> Issue Type: Bug
> Components: Client
> Reporter: Maxim Khutornenko
> Priority: Minor
> Labels: newbie
>
> The recent [health checker refactoring|https://reviews.apache.org/r/41428]
> added a new message:
> {noformat}
> WARNING: endpoint, expected_response, and expected_response_code are deprecated and will be removed
> in the next release. Please consult updated documentation.
> {noformat}
> This message always shows up even when users don't explicitly specify
> deprecated fields. It should only warn when the .aurora file contains
> HealthCheckConfig with any of the deprecated fields.
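The behaviour asked for in the issue, warning only about fields the user actually wrote, comes down to checking the user-supplied keys before defaults are merged in. The sketch below shows that check in isolation. It is not the Aurora client (which is not written in Java); `DeprecationCheck`, `warningFor`, and the wording of the returned message are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical check, not part of the Aurora client.
final class DeprecationCheck {
  // Field names taken from the warning quoted in the issue.
  static final List<String> DEPRECATED_FIELDS =
      Arrays.asList("endpoint", "expected_response", "expected_response_code");

  private DeprecationCheck() {}

  // userSupplied must hold only the keys the user wrote, before any defaults are applied.
  static Optional<String> warningFor(Map<String, ?> userSupplied) {
    List<String> used = new ArrayList<>();
    for (String field : DEPRECATED_FIELDS) {
      if (userSupplied.containsKey(field)) {
        used.add(field);
      }
    }
    if (used.isEmpty()) {
      return Optional.empty();
    }
    return Optional.of(
        "WARNING: deprecated HealthCheckConfig fields in use: " + String.join(", ", used));
  }
}
```

Running the check against the merged config instead would reproduce the reported problem, because the defaults make every deprecated key look present.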
[jira] [Created] (AURORA-1589) Dupe debs are showing up
Dmitriy Shirchenko created AURORA-1589: -- Summary: Dupe debs are showing up Key: AURORA-1589 URL: https://issues.apache.org/jira/browse/AURORA-1589 Project: Aurora Issue Type: Story Reporter: Dmitriy Shirchenko Assignee: Bill Farner Looks like jessie and trusty packages are colliding and trusty (last one to build) is winning out: https://builds.apache.org/job/aurora-packaging-nightly/lastSuccessfulBuild/artifact/aurora-packaging/artifacts/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AURORA-1588) Unnecessary HealthCheckConfig endpoint warning
[ https://issues.apache.org/jira/browse/AURORA-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Farner updated AURORA-1588: Labels: newbie (was: ) > Unnecessary HealthCheckConfig endpoint warning > -- > > Key: AURORA-1588 > URL: https://issues.apache.org/jira/browse/AURORA-1588 > Project: Aurora > Issue Type: Bug > Components: Client >Reporter: Maxim Khutornenko > Labels: newbie > > The recent [health checker refactoring|https://reviews.apache.org/r/41428] > added a new message: > {noformat} > WARNING: endpoint, expected_response, and expected_response_code are > deprecated and will be removed > in the next release. Please consult updated documentation. > {noformat} > This message always shows up even when users don't explicitly specify > deprecated fields. It should only warn when the .aurora file contains > HealthCheckConfig with any of the deprecated fields. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AURORA-1588) Unnecessary HealthCheckConfig endpoint warning
[ https://issues.apache.org/jira/browse/AURORA-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Farner updated AURORA-1588: Priority: Minor (was: Major) > Unnecessary HealthCheckConfig endpoint warning > -- > > Key: AURORA-1588 > URL: https://issues.apache.org/jira/browse/AURORA-1588 > Project: Aurora > Issue Type: Bug > Components: Client >Reporter: Maxim Khutornenko >Priority: Minor > Labels: newbie > > The recent [health checker refactoring|https://reviews.apache.org/r/41428] > added a new message: > {noformat} > WARNING: endpoint, expected_response, and expected_response_code are > deprecated and will be removed > in the next release. Please consult updated documentation. > {noformat} > This message always shows up even when users don't explicitly specify > deprecated fields. It should only warn when the .aurora file contains > HealthCheckConfig with any of the deprecated fields. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AURORA-1588) Unnecessary HealthCheckConfig endpoint warning
[ https://issues.apache.org/jira/browse/AURORA-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Farner updated AURORA-1588: Summary: Unnecessary HealthCheckConfig endpoint warning (was: HealthCheckConfig endpoint warning shows up unnecessary ) > Unnecessary HealthCheckConfig endpoint warning > -- > > Key: AURORA-1588 > URL: https://issues.apache.org/jira/browse/AURORA-1588 > Project: Aurora > Issue Type: Bug > Components: Client >Reporter: Maxim Khutornenko > > The recent [health checker refactoring|https://reviews.apache.org/r/41428] > added a new message: > {noformat} > WARNING: endpoint, expected_response, and expected_response_code are > deprecated and will be removed > in the next release. Please consult updated documentation. > {noformat} > This message always shows up even when users don't explicitly specify > deprecated fields. It should only warn when the .aurora file contains > HealthCheckConfig with any of the deprecated fields. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (AURORA-1588) HealthCheckConfig endpoint warning shows up unnecessary
Maxim Khutornenko created AURORA-1588: - Summary: HealthCheckConfig endpoint warning shows up unnecessary Key: AURORA-1588 URL: https://issues.apache.org/jira/browse/AURORA-1588 Project: Aurora Issue Type: Bug Components: Client Reporter: Maxim Khutornenko The recent [health checker refactoring|https://reviews.apache.org/r/41428] added a new message: {noformat} WARNING: endpoint, expected_response, and expected_response_code are deprecated and will be removed in the next release. Please consult updated documentation. {noformat} This message always shows up even when users don't explicitly specify deprecated fields. It should only warn when the .aurora file contains HealthCheckConfig with any of the deprecated fields. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AURORA-1580) java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore
[ https://issues.apache.org/jira/browse/AURORA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zameer Manji updated AURORA-1580:

Story Points: (was: 8)

> java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore
> ---
>
> Key: AURORA-1580
> URL: https://issues.apache.org/jira/browse/AURORA-1580
> Project: Aurora
> Issue Type: Bug
> Components: Scheduler
> Reporter: Zameer Manji
> Assignee: Bill Farner
>
> I have discovered the following exception from a scheduler that is running
> off master with the beta task store enabled.
> {noformat}
> E0113 22:51:55.941 [AsyncProcessor-2, AsyncUtil:123] java.util.concurrent.ExecutionException: java.util.NoSuchElementException
> java.util.concurrent.ExecutionException: java.util.NoSuchElementException
>     at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[na:1.8.0_66-Tw8r9b1]
>     at java.util.concurrent.FutureTask.get(FutureTask.java:192) ~[na:1.8.0_66-Tw8r9b1]
>     at org.apache.aurora.scheduler.base.AsyncUtil.evaluateResult(AsyncUtil.java:118) [aurora-110.jar:na]
>     at org.apache.aurora.scheduler.base.AsyncUtil.access$000(AsyncUtil.java:32) [aurora-110.jar:na]
>     at org.apache.aurora.scheduler.base.AsyncUtil$1.afterExecute(AsyncUtil.java:59) [aurora-110.jar:na]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1150) [na:1.8.0_66-Tw8r9b1]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_66-Tw8r9b1]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66-Tw8r9b1]
> Caused by: java.util.NoSuchElementException: null
>     at com.google.common.collect.Iterables.getLast(Iterables.java:784) ~[guava-19.0.jar:na]
>     at org.apache.aurora.scheduler.base.Tasks.getLatestEvent(Tasks.java:149) ~[aurora-110.jar:na]
>     at org.apache.aurora.scheduler.base.Tasks$1.apply(Tasks.java:156) ~[aurora-110.jar:na]
>     at org.apache.aurora.scheduler.base.Tasks$1.apply(Tasks.java:153) ~[aurora-110.jar:na]
>     at com.google.common.collect.ByFunctionOrdering.compare(ByFunctionOrdering.java:45) ~[guava-19.0.jar:na]
>     at java.util.TimSort.binarySort(TimSort.java:296) ~[na:1.8.0_66-Tw8r9b1]
>     at java.util.TimSort.sort(TimSort.java:239) ~[na:1.8.0_66-Tw8r9b1]
>     at java.util.Arrays.sort(Arrays.java:1438) ~[na:1.8.0_66-Tw8r9b1]
>     at com.google.common.collect.Ordering.sortedCopy(Ordering.java:860) ~[guava-19.0.jar:na]
>     at org.apache.aurora.scheduler.pruning.TaskHistoryPruner.lambda$registerInactiveTask$20(TaskHistoryPruner.java:156) ~[aurora-110.jar:na]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_66-Tw8r9b1]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_66-Tw8r9b1]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[na:1.8.0_66-Tw8r9b1]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[na:1.8.0_66-Tw8r9b1]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_66-Tw8r9b1]
>     ... 2 common frames omitted
> {noformat}
> Similar exception occurs within the preemptor and causes the scheduler to
> crash:
> {noformat}
> E0113 01:43:06.242 THREAD5037 com.google.common.util.concurrent.ServiceManager$ServiceListener.failed: Service PreemptorService [FAILED] has failed in the RUNNING state.
> java.util.NoSuchElementException
>     at com.google.common.collect.Iterables.getLast(Iterables.java:784)
>     at org.apache.aurora.scheduler.base.Tasks.getLatestEvent(Tasks.java:149)
>     at org.apache.aurora.scheduler.preemptor.PendingTaskProcessor$3.apply(PendingTaskProcessor.java:237)
>     at org.apache.aurora.scheduler.preemptor.PendingTaskProcessor$3.apply(PendingTaskProcessor.java:234)
>     at com.google.common.base.Predicates$AndPredicate.apply(Predicates.java:374)
>     at com.google.common.collect.Iterators$7.computeNext(Iterators.java:675)
>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>     at com.google.common.collect.TransformedIterator.hasNext(TransformedIterator.java:43)
>     at com.google.common.collect.Iterators.addAll(Iterators.java:364)
>     at com.google.common.collect.Iterables.addAll(Iterables.java:352)
>     at com.google.common.collect.Ha
[jira] [Updated] (AURORA-1580) java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore
[ https://issues.apache.org/jira/browse/AURORA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zameer Manji updated AURORA-1580:

Sprint: (was: Twitter Aurora Q1'16 Sprint 17)
[jira] [Commented] (AURORA-1580) java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore
[ https://issues.apache.org/jira/browse/AURORA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111514#comment-15111514 ]

Bill Farner commented on AURORA-1580:

https://reviews.apache.org/r/42613/
[jira] [Assigned] (AURORA-1580) java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore
[ https://issues.apache.org/jira/browse/AURORA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Farner reassigned AURORA-1580:

Assignee: Bill Farner (was: Zameer Manji)
[jira] [Commented] (AURORA-1052) Populate Labels in TaskConfig
[ https://issues.apache.org/jira/browse/AURORA-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1568#comment-1568 ]

Zameer Manji commented on AURORA-1052:
--------------------------------------

My concern is that if we have user-supplied labels in TaskConfig, we will have a backwards compatibility issue if the scheduler starts populating labels as well.

+1 to step #2.

> Populate Labels in TaskConfig
> -----------------------------
>
> Key: AURORA-1052
> URL: https://issues.apache.org/jira/browse/AURORA-1052
> Project: Aurora
> Issue Type: Story
> Components: Scheduler
> Reporter: Stephan Erb
> Priority: Minor
> Labels: newbie
>
> Mesos has introduced labels on tasks (MESOS-2120). These correspond to what
> Aurora calls metadata.
> We should therefore set task labels according to our metadata information.
[jira] [Commented] (AURORA-1580) java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore
[ https://issues.apache.org/jira/browse/AURORA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111055#comment-15111055 ]

Bill Farner commented on AURORA-1580:
-------------------------------------

{quote}
Is there a reason they were done as a cascading delete + re-insert to begin with? Was it just for the sake of not having to diff the task to see what had changed?
{quote}

That's effectively it. The storage API is non-specific about what is being changed in the {{IScheduledTask}} object tree; a save is essentially a PUT. While we could implement the diffing, I think specific mutation verbs (of which there are few) are much more straightforward.

> java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore
> ---------------------------------------------------------------------------
>
> Key: AURORA-1580
> URL: https://issues.apache.org/jira/browse/AURORA-1580
> Project: Aurora
> Issue Type: Bug
> Components: Scheduler
> Reporter: Zameer Manji
> Assignee: Zameer Manji
>
> I have discovered the following exception from a scheduler that is running
> off master with the beta task store enabled.
> {noformat}
> E0113 22:51:55.941 [AsyncProcessor-2, AsyncUtil:123] java.util.concurrent.ExecutionException: java.util.NoSuchElementException
> java.util.concurrent.ExecutionException: java.util.NoSuchElementException
>     at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[na:1.8.0_66-Tw8r9b1]
>     at java.util.concurrent.FutureTask.get(FutureTask.java:192) ~[na:1.8.0_66-Tw8r9b1]
>     at org.apache.aurora.scheduler.base.AsyncUtil.evaluateResult(AsyncUtil.java:118) [aurora-110.jar:na]
>     at org.apache.aurora.scheduler.base.AsyncUtil.access$000(AsyncUtil.java:32) [aurora-110.jar:na]
>     at org.apache.aurora.scheduler.base.AsyncUtil$1.afterExecute(AsyncUtil.java:59) [aurora-110.jar:na]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1150) [na:1.8.0_66-Tw8r9b1]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_66-Tw8r9b1]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66-Tw8r9b1]
> Caused by: java.util.NoSuchElementException: null
>     at com.google.common.collect.Iterables.getLast(Iterables.java:784) ~[guava-19.0.jar:na]
>     at org.apache.aurora.scheduler.base.Tasks.getLatestEvent(Tasks.java:149) ~[aurora-110.jar:na]
>     at org.apache.aurora.scheduler.base.Tasks$1.apply(Tasks.java:156) ~[aurora-110.jar:na]
>     at org.apache.aurora.scheduler.base.Tasks$1.apply(Tasks.java:153) ~[aurora-110.jar:na]
>     at com.google.common.collect.ByFunctionOrdering.compare(ByFunctionOrdering.java:45) ~[guava-19.0.jar:na]
>     at java.util.TimSort.binarySort(TimSort.java:296) ~[na:1.8.0_66-Tw8r9b1]
>     at java.util.TimSort.sort(TimSort.java:239) ~[na:1.8.0_66-Tw8r9b1]
>     at java.util.Arrays.sort(Arrays.java:1438) ~[na:1.8.0_66-Tw8r9b1]
>     at com.google.common.collect.Ordering.sortedCopy(Ordering.java:860) ~[guava-19.0.jar:na]
>     at org.apache.aurora.scheduler.pruning.TaskHistoryPruner.lambda$registerInactiveTask$20(TaskHistoryPruner.java:156) ~[aurora-110.jar:na]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_66-Tw8r9b1]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_66-Tw8r9b1]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[na:1.8.0_66-Tw8r9b1]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[na:1.8.0_66-Tw8r9b1]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_66-Tw8r9b1]
>     ... 2 common frames omitted
> {noformat}
> Similar exception occurs within the preemptor and causes the scheduler to
> crash:
> {noformat}
> E0113 01:43:06.242 THREAD5037 com.google.common.util.concurrent.ServiceManager$ServiceListener.failed: Service PreemptorService [FAILED] has failed in the RUNNING state.
> java.util.NoSuchElementException
>     at com.google.common.collect.Iterables.getLast(Iterables.java:784)
>     at org.apache.aurora.scheduler.base.Tasks.getLatestEvent(Tasks.java:149)
>     at org.apache.aurora.scheduler.preemptor.PendingTaskProcessor$3.apply(PendingTaskProcessor.java:237)
>     at org.apache.aurora.scheduler.preemptor.PendingTaskProcessor$3.apply(PendingTaskProcessor.java:234)
>     at com.google.common.base.Predicates$AndPredicate.apply(Predicates.java:374)
>     at com.google.common.collect.Iterators$7.computeNext(Iterators.java:675)
>     at
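The "specific mutation verbs" idea from the comment above could look roughly like the sketch below. This is an illustration only, not Aurora's storage API: it uses Python's `sqlite3` module, and the two-table schema, the `status` column, and the function names `create_task` and `change_status` are all assumptions made for the example. The point it tries to show is that a verb which updates the task row in place and appends a single event leaves the row id stable, so a reader holding that id still finds the events.

```python
import sqlite3

# Hypothetical, simplified schema; the real scheduler schema is not shown in this thread.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE tasks (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        task_id TEXT NOT NULL UNIQUE,
        status TEXT NOT NULL
    );
    CREATE TABLE task_events (
        task_row_id INTEGER NOT NULL REFERENCES tasks(id),
        status TEXT NOT NULL
    );
""")


def create_task(task_id):
    """Insert a new task row plus its initial PENDING event; return the row id."""
    with db:  # one transaction, committed on exit
        row_id = db.execute(
            "INSERT INTO tasks (task_id, status) VALUES (?, 'PENDING')",
            (task_id,),
        ).lastrowid
        db.execute(
            "INSERT INTO task_events (task_row_id, status) VALUES (?, 'PENDING')",
            (row_id,),
        )
    return row_id


def change_status(task_id, new_status):
    """Hypothetical mutation verb: update the row in place and append one event."""
    with db:
        row_id = db.execute(
            "SELECT id FROM tasks WHERE task_id = ?", (task_id,)
        ).fetchone()[0]
        db.execute("UPDATE tasks SET status = ? WHERE id = ?", (new_status, row_id))
        db.execute(
            "INSERT INTO task_events (task_row_id, status) VALUES (?, ?)",
            (row_id, new_status),
        )
    return row_id


first = create_task("task-1")
second = change_status("task-1", "ASSIGNED")

# The row id captured before the mutation still resolves to the full event history.
events = [
    row[0]
    for row in db.execute(
        "SELECT status FROM task_events WHERE task_row_id = ? ORDER BY rowid",
        (first,),
    )
]
```

Because nothing is deleted, no reader can observe a task whose event list is empty, which is the condition under which `Iterables.getLast` throws in the traces above.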
[jira] [Comment Edited] (AURORA-1580) java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore
[ https://issues.apache.org/jira/browse/AURORA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111038#comment-15111038 ]

Joshua Cohen edited comment on AURORA-1580 at 1/21/16 6:23 PM:
---------------------------------------------------------------

{quote}
I'd like to add, however, that implementing task mutations as insert/update rather than a full relational delete + re-insert would also address the issue.
{quote}

This seems like a better solution to me. Is there a reason they were done as a cascading delete + re-insert to begin with? Was it just for the sake of not having to diff the task to see what had changed?

was (Author: joshua.cohen):
{noformat}
I'd like to add, however, that implementing task mutations as insert/update rather than a full relational delete + re-insert would also address the issue.
{noformat}

This seems like a better solution to me. Is there a reason they were done as a cascading delete + re-insert to begin with? Was it just for the sake of not having to diff the task to see what had changed?

> java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore
> ---------------------------------------------------------------------------
>
> Key: AURORA-1580
> URL: https://issues.apache.org/jira/browse/AURORA-1580
> Project: Aurora
> Issue Type: Bug
> Components: Scheduler
> Reporter: Zameer Manji
> Assignee: Zameer Manji
>
> I have discovered the following exception from a scheduler that is running
> off master with the beta task store enabled.
[jira] [Commented] (AURORA-1580) java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore
[ https://issues.apache.org/jira/browse/AURORA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111038#comment-15111038 ]

Joshua Cohen commented on AURORA-1580:
--------------------------------------

{noformat}
I'd like to add, however, that implementing task mutations as insert/update rather than a full relational delete + re-insert would also address the issue.
{noformat}

This seems like a better solution to me. Is there a reason they were done as a cascading delete + re-insert to begin with? Was it just for the sake of not having to diff the task to see what had changed?

> java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore
> ---------------------------------------------------------------------------
>
> Key: AURORA-1580
> URL: https://issues.apache.org/jira/browse/AURORA-1580
> Project: Aurora
> Issue Type: Bug
> Components: Scheduler
> Reporter: Zameer Manji
> Assignee: Zameer Manji
>
> I have discovered the following exception from a scheduler that is running
> off master with the beta task store enabled.
[jira] [Commented] (AURORA-1580) java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore
[ https://issues.apache.org/jira/browse/AURORA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111031#comment-15111031 ]

Bill Farner commented on AURORA-1580:
-------------------------------------

I have determined why READ COMMITTED does not work in the above test case. Thankfully it is a SQL-level concurrency problem.

Here is our current sequence of events for saving a task:
1. {{DELETE FROM tasks WHERE task_id IN ( ? )}}
2. {{INSERT INTO tasks ...}}
3. {{INSERT INTO task_events ...}}
4. {{INSERT INTO task_ports ...}}

Here is the sequence for reading a task:
a. {{SELECT ... FROM tasks WHERE task_id = ?}}
b. {{SELECT ... FROM task_ports WHERE e.task_row_id = ?}}
c. {{SELECT ... FROM task_events WHERE e.task_row_id = ?}}

When READ COMMITTED behaves as expected, we can encounter this issue with a sequence like a1234bc, since operations b and c refer to a since-deleted {{task_row_id}}. Therefore being functional with READ COMMITTED should just be a matter of using {{task_id}} consistently as the where clause.

I'd like to add, however, that implementing task mutations as insert/update rather than a full relational delete + re-insert would also address the issue.

> java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore
> ---------------------------------------------------------------------------
>
> Key: AURORA-1580
> URL: https://issues.apache.org/jira/browse/AURORA-1580
> Project: Aurora
> Issue Type: Bug
> Components: Scheduler
> Reporter: Zameer Manji
> Assignee: Zameer Manji
>
> I have discovered the following exception from a scheduler that is running
> off master with the beta task store enabled.
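The a1234bc interleaving from Bill Farner's analysis can be reproduced in miniature. What follows is only a sketch under stated assumptions: it uses Python's `sqlite3` module rather than the scheduler's actual store, the two-table schema and the helper names (`save_task`, `read_row_id`, `events_by_row_id`, `events_by_task_id`) are invented for illustration, and `task_ports` is left out. `AUTOINCREMENT` is used so that a deleted row id is never handed out again, which stands in for the "since-deleted {{task_row_id}}" in the comment.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical two-table task schema."""
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")  # needed for ON DELETE CASCADE in SQLite
    db.executescript("""
        CREATE TABLE tasks (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            task_id TEXT NOT NULL UNIQUE
        );
        CREATE TABLE task_events (
            task_row_id INTEGER NOT NULL REFERENCES tasks(id) ON DELETE CASCADE,
            status TEXT NOT NULL
        );
    """)
    return db


def save_task(db, task_id, statuses):
    """Writer steps 1-3: cascading delete, then re-insert the task and its events."""
    with db:  # a single committed transaction
        db.execute("DELETE FROM tasks WHERE task_id = ?", (task_id,))
        row_id = db.execute(
            "INSERT INTO tasks (task_id) VALUES (?)", (task_id,)
        ).lastrowid
        db.executemany(
            "INSERT INTO task_events (task_row_id, status) VALUES (?, ?)",
            [(row_id, status) for status in statuses],
        )


def read_row_id(db, task_id):
    """Reader step a: look up the surrogate row id for a task."""
    return db.execute(
        "SELECT id FROM tasks WHERE task_id = ?", (task_id,)
    ).fetchone()[0]


def events_by_row_id(db, row_id):
    """Reader step c as described: filter events on the surrogate row id."""
    return [
        row[0]
        for row in db.execute(
            "SELECT status FROM task_events WHERE task_row_id = ? ORDER BY rowid",
            (row_id,),
        )
    ]


def events_by_task_id(db, task_id):
    """Suggested fix: filter on the stable task_id instead of the row id."""
    return [
        row[0]
        for row in db.execute(
            "SELECT e.status FROM task_events e "
            "JOIN tasks t ON t.id = e.task_row_id "
            "WHERE t.task_id = ? ORDER BY e.rowid",
            (task_id,),
        )
    ]


db = make_db()
save_task(db, "task-1", ["PENDING"])

stale_row_id = read_row_id(db, "task-1")          # reader: step a
save_task(db, "task-1", ["PENDING", "ASSIGNED"])  # writer commits in between
by_row = events_by_row_id(db, stale_row_id)       # reader: step c, stale id
by_id = events_by_task_id(db, "task-1")           # reader: keyed on task_id
```

With the stale row id the reader sees a task with no events at all, which matches the empty collection that makes `Iterables.getLast` throw `NoSuchElementException`; keyed on `task_id`, the same read returns the re-inserted events.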