[jira] [Commented] (AURORA-1589) Dupe debs are showing up

2016-01-21 Thread John Sirois (JIRA)

[ 
https://issues.apache.org/jira/browse/AURORA-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111772#comment-15111772
 ] 

John Sirois commented on AURORA-1589:
-

I'd self-serve on this sort of thing if I had access to the Jenkins job configs, 
or if you turned the current job config for aurora-packaging into a single call 
to a script checked into the aurora-packaging repo. That would make future 
changes self-serve through the normal RB process.

> Dupe debs are showing up
> 
>
> Key: AURORA-1589
> URL: https://issues.apache.org/jira/browse/AURORA-1589
> Project: Aurora
>  Issue Type: Story
>Reporter: Dmitriy Shirchenko
>Assignee: Bill Farner
>
> Looks like jessie and trusty packages are colliding and trusty (last one to 
> build) is winning out:
> https://builds.apache.org/job/aurora-packaging-nightly/lastSuccessfulBuild/artifact/aurora-packaging/artifacts/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AURORA-1589) Dupe debs are showing up

2016-01-21 Thread John Sirois (JIRA)

[ 
https://issues.apache.org/jira/browse/AURORA-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111769#comment-15111769
 ] 

John Sirois commented on AURORA-1589:
-

I think it would be strange.  Better, IMO, would be to ship the artifacts off to 
svn (https://dist.apache.org/repos/dist/dev/aurora) as more officially served 
nightlies, if we want to support nightlies at all.  I'm unfamiliar with any 
Apache protocols that apply to nightlies, though.

> Dupe debs are showing up
> 
>
> Key: AURORA-1589
> URL: https://issues.apache.org/jira/browse/AURORA-1589
> Project: Aurora
>  Issue Type: Story
>Reporter: Dmitriy Shirchenko
>Assignee: Bill Farner
>
> Looks like jessie and trusty packages are colliding and trusty (last one to 
> build) is winning out:
> https://builds.apache.org/job/aurora-packaging-nightly/lastSuccessfulBuild/artifact/aurora-packaging/artifacts/





[jira] [Created] (AURORA-1590) Make it possible to view task event dates in UTC in the scheduler UI

2016-01-21 Thread Joshua Cohen (JIRA)
Joshua Cohen created AURORA-1590:


 Summary: Make it possible to view task event dates in UTC in the 
scheduler UI
 Key: AURORA-1590
 URL: https://issues.apache.org/jira/browse/AURORA-1590
 Project: Aurora
  Issue Type: Task
  Components: UI
Reporter: Joshua Cohen
Priority: Minor


On the update page, if you hover over a local date, we show you a tooltip that 
contains that date in UTC. It would be nice if we did this consistently. Most 
notably, this would be useful for task event dates on the job page.
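
As a rough illustration of the request, the same event timestamp can be rendered once in the viewer's zone and once in UTC. This is only a sketch in Java: the scheduler UI itself is not written in Java, and the class name `EventTimes`, the output pattern, and the sample timestamp are assumptions made for the example.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

/** Hypothetical helper: formats one event timestamp for a chosen time zone. */
final class EventTimes {
    // Numeric offsets ("+00:00") keep the output unambiguous across locales.
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss xxx", Locale.ROOT);

    private EventTimes() {}

    static String format(long epochMillis, ZoneId zone) {
        return FORMAT.format(Instant.ofEpochMilli(epochMillis).atZone(zone));
    }

    public static void main(String[] args) {
        // Sample timestamp chosen for illustration only.
        long eventMillis = Instant.parse("2016-01-22T01:39:00Z").toEpochMilli();
        // What the page would show by default (viewer's local zone).
        System.out.println("local: " + format(eventMillis, ZoneId.systemDefault()));
        // What the tooltip would show.
        System.out.println("UTC:   " + format(eventMillis, ZoneOffset.UTC));
    }
}
```

Keeping the instant as epoch milliseconds and converting only at display time means both renderings always describe the same moment.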





[jira] [Commented] (AURORA-1589) Dupe debs are showing up

2016-01-21 Thread Bill Farner (JIRA)

[ 
https://issues.apache.org/jira/browse/AURORA-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111765#comment-15111765
 ] 

Bill Farner commented on AURORA-1589:
-

The clobbering is not happening (yet, anyway).  What you see on that page is 
the result of the Jenkins configuration:

"Files to archive": 
aurora-packaging/artifacts/aurora-ubuntu-trusty/dist/*.deb,aurora-packaging/artifacts/aurora-centos-7/dist/rpmbuild/RPMS/x86_64/*.rpm

They probably do step on each other, though, so we will need to see if Jenkins 
can include a path in the archive name.  Alternatively, would it be strange to 
include the dist name in the deb file name?
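
One way to picture the second option is to prefix each archived file with the distribution directory it was built in. The sketch below is an assumption-laden illustration, not the Jenkins or aurora-packaging behaviour: the class `ArtifactNames`, the `aurora-debian-jessie` directory, and the package file name are all hypothetical; only the `artifacts/<dist>/dist/` layout comes from the "Files to archive" setting above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: builds a collision-free archive name from an artifact path. */
final class ArtifactNames {
    private ArtifactNames() {}

    /**
     * Expects a path containing artifacts/<dist>/.../<file> and returns "<dist>-<file>".
     */
    static String qualify(String artifactPath) {
        Path path = Paths.get(artifactPath);
        int count = path.getNameCount();
        for (int i = 0; i < count - 1; i++) {
            // The element after "artifacts" is the distribution directory,
            // provided it is not the file name itself.
            if (path.getName(i).toString().equals("artifacts") && i + 1 < count - 1) {
                String dist = path.getName(i + 1).toString();
                return dist + "-" + path.getFileName();
            }
        }
        throw new IllegalArgumentException("No artifacts/<dist>/ segment in " + artifactPath);
    }

    public static void main(String[] args) {
        // Hypothetical package file name, identical for both distributions.
        String deb = "aurora-scheduler_0.12.0_amd64.deb";
        System.out.println(qualify("aurora-packaging/artifacts/aurora-ubuntu-trusty/dist/" + deb));
        System.out.println(qualify("aurora-packaging/artifacts/aurora-debian-jessie/dist/" + deb));
    }
}
```

With the prefix in place, two builds that produce the same file name no longer map to the same archived name.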

> Dupe debs are showing up
> 
>
> Key: AURORA-1589
> URL: https://issues.apache.org/jira/browse/AURORA-1589
> Project: Aurora
>  Issue Type: Story
>Reporter: Dmitriy Shirchenko
>Assignee: Bill Farner
>
> Looks like jessie and trusty packages are colliding and trusty (last one to 
> build) is winning out:
> https://builds.apache.org/job/aurora-packaging-nightly/lastSuccessfulBuild/artifact/aurora-packaging/artifacts/





[jira] [Commented] (AURORA-1589) Dupe debs are showing up

2016-01-21 Thread Dmitriy Shirchenko (JIRA)

[ 
https://issues.apache.org/jira/browse/AURORA-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111759#comment-15111759
 ] 

Dmitriy Shirchenko commented on AURORA-1589:


cc [~jsirois]

> Dupe debs are showing up
> 
>
> Key: AURORA-1589
> URL: https://issues.apache.org/jira/browse/AURORA-1589
> Project: Aurora
>  Issue Type: Story
>Reporter: Dmitriy Shirchenko
>Assignee: Bill Farner
>
> Looks like jessie and trusty packages are colliding and trusty (last one to 
> build) is winning out:
> https://builds.apache.org/job/aurora-packaging-nightly/lastSuccessfulBuild/artifact/aurora-packaging/artifacts/





[jira] [Comment Edited] (AURORA-1582) Task History Pruning attempts can fail silently

2016-01-21 Thread Zameer Manji (JIRA)

[ 
https://issues.apache.org/jira/browse/AURORA-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111758#comment-15111758
 ] 

Zameer Manji edited comment on AURORA-1582 at 1/22/16 1:39 AM:
---

{noformat}
commit c89fecbcd93aa6ddcf6af60f2a2cb6315b7a4d19
Author: Zameer Manji 
Date:   Thu Jan 21 17:38:25 2016 -0800

Turn TaskHistoryPruner into a service and trigger shutdown on pruning failure.

Task pruning is key to operating a large cluster and failure to prune should
trigger shutdown to prevent unbounded growth of storage. This patch turns
`TaskHistoryPruner` into a service which propagates failure from failed pruning
attempts towards the `ServiceManager`. Also completing a TODO which removes a
test for behaviour that is very awkward to test for.

Bugs closed: AURORA-1582

Reviewed at https://reviews.apache.org/r/42332/

 3 files changed, 68 insertions(+), 111 deletions(-)
{noformat}


was (Author: zmanji):

commit c89fecbcd93aa6ddcf6af60f2a2cb6315b7a4d19
Author: Zameer Manji 
Date:   Thu Jan 21 17:38:25 2016 -0800

Turn TaskHistoryPruner into a service and trigger shutdown on pruning failure.

Task pruning is key to operating a large cluster and failure to prune should
trigger shutdown to prevent unbounded growth of storage. This patch turns
`TaskHistoryPruner` into a service which propagates failure from failed pruning
attempts towards the `ServiceManager`. Also completing a TODO which removes a
test for behaviour that is very awkward to test for.

Bugs closed: AURORA-1582

Reviewed at https://reviews.apache.org/r/42332/

 3 files changed, 68 insertions(+), 111 deletions(-)


> Task History Pruning attempts can fail silently
> ---
>
> Key: AURORA-1582
> URL: https://issues.apache.org/jira/browse/AURORA-1582
> Project: Aurora
>  Issue Type: Bug
>Reporter: Zameer Manji
>Assignee: Zameer Manji
>
> As discovered in AURORA-1580, task history pruning attempts can fail and if 
> they do fail, they fail silently. The root cause seems to be that 
> AsyncModule's {{AsyncProcessor}} threads just log the unhandled exception if 
> it exists:
> {noformat}
>   private static void evaluateResult(Runnable runnable, Throwable throwable, Logger logger) {
>     // See java.util.concurrent.ThreadPoolExecutor#afterExecute(Runnable, Throwable)
>     // for more details and an implementation example.
>     if (throwable == null) {
>       if (runnable instanceof Future) {
>         try {
>           Future<?> future = (Future<?>) runnable;
>           if (future.isDone()) {
>             future.get();
>           }
>         } catch (InterruptedException ie) {
>           Thread.currentThread().interrupt();
>         } catch (ExecutionException ee) {
>           logger.error(ee.toString(), ee);
>         }
>       }
>     } else {
>       logger.error(throwable.toString(), throwable);
>     }
>   }
> {noformat}
> I think that instead of failing silently when work on these threads fails, we 
> should shut down the scheduler, much like we do when the preemptor or another 
> Guava service fails. This way the scheduler does not enter an undefined state 
> and operators are informed of the abnormal behaviour.
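
The idea in the description can be sketched with the JDK alone: an executor whose `afterExecute` hands any task failure to a callback instead of only logging it. This is not the patch from the review above, which uses a Guava service and the `ServiceManager`; the class name `FailFastExecutor` and the `onFailure` callback are assumptions made for illustration, and a real scheduler would trigger shutdown inside that callback.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/** Hypothetical executor that reports task failures to a handler instead of only logging them. */
final class FailFastExecutor extends ThreadPoolExecutor {
    private final Consumer<Throwable> onFailure;

    FailFastExecutor(int threads, Consumer<Throwable> onFailure) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        this.onFailure = onFailure;
    }

    @Override
    protected void afterExecute(Runnable runnable, Throwable throwable) {
        super.afterExecute(runnable, throwable);
        Throwable failure = throwable;
        // Tasks passed to submit() are wrapped in a Future that captures the exception,
        // so it has to be extracted explicitly.
        if (failure == null && runnable instanceof Future<?>) {
            Future<?> future = (Future<?>) runnable;
            try {
                if (future.isDone()) {
                    future.get();
                }
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
            } catch (CancellationException ce) {
                // A cancelled task is not treated as a failure in this sketch.
            } catch (ExecutionException ee) {
                failure = ee.getCause();
            }
        }
        if (failure != null) {
            onFailure.accept(failure);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch failed = new CountDownLatch(1);
        AtomicReference<Throwable> seen = new AtomicReference<>();
        // The sketch only records the failure; shutdown would be triggered here.
        FailFastExecutor executor = new FailFastExecutor(1, t -> {
            seen.set(t);
            failed.countDown();
        });
        Runnable pruneAttempt = () -> {
            throw new IllegalStateException("pruning failed");
        };
        executor.submit(pruneAttempt);
        failed.await(5, TimeUnit.SECONDS);
        executor.shutdown();
        System.out.println("Observed failure: " + seen.get());
    }
}
```

The callback runs on the worker thread, so anything heavier than signalling another component should be handed off rather than done inline.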





[jira] [Commented] (AURORA-1588) Unnecessary HealthCheckConfig endpoint warning

2016-01-21 Thread Maxim Khutornenko (JIRA)

[ 
https://issues.apache.org/jira/browse/AURORA-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111756#comment-15111756
 ] 

Maxim Khutornenko commented on AURORA-1588:
---

BTW, this seems to be a general problem with our deprecation messaging. E.g. 
this warning also shows up all the time:
{noformat}
restart_threshold has been deprecated and will be removed in a future release
{noformat}

This is due to default values still being applied in the config. A possible 
solution could be dropping defaults where possible. E.g. it's safe to drop the 
default for {{restart_threshold}}, but not for the HealthCheckConfig values, as 
they are passed into the executor.
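
Where a default cannot be dropped, another option is to run the deprecation check against what the user actually wrote, before defaults are merged in. The following is a minimal sketch of that ordering, written in Java purely for illustration (the Aurora client is not Java); the class `DeprecationCheck`, its methods, and the default values shown are hypothetical, and only the three field names come from the warning text in this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical check: warn only about deprecated fields the user set explicitly. */
final class DeprecationCheck {
    // Field names taken from the warning text quoted in AURORA-1588.
    static final List<String> DEPRECATED =
        Arrays.asList("endpoint", "expected_response", "expected_response_code");

    private DeprecationCheck() {}

    /** Inspects only user-supplied keys; defaults are never looked at here. */
    static List<String> explicitlySet(Map<String, Object> userSupplied) {
        List<String> found = new ArrayList<>();
        for (String field : DEPRECATED) {
            if (userSupplied.containsKey(field)) {
                found.add(field);
            }
        }
        return found;
    }

    /** Merges defaults after the check, so they can still be passed on downstream. */
    static Map<String, Object> withDefaults(
            Map<String, Object> userSupplied, Map<String, Object> defaults) {
        Map<String, Object> merged = new LinkedHashMap<>(defaults);
        merged.putAll(userSupplied);
        return merged;
    }

    public static void main(String[] args) {
        // Default values are placeholders for the example.
        Map<String, Object> defaults = new LinkedHashMap<>();
        defaults.put("endpoint", "/health");
        defaults.put("expected_response", "ok");

        Map<String, Object> user = new LinkedHashMap<>();
        user.put("interval_secs", 10);

        List<String> deprecatedInUse = explicitlySet(user);
        if (!deprecatedInUse.isEmpty()) {
            System.out.println("WARNING: deprecated fields in use: " + deprecatedInUse);
        }
        System.out.println("Effective config: " + withDefaults(user, defaults));
    }
}
```

Because the check never sees the merged config, a default value for a deprecated field cannot trigger the warning on its own.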

> Unnecessary HealthCheckConfig endpoint warning
> --
>
> Key: AURORA-1588
> URL: https://issues.apache.org/jira/browse/AURORA-1588
> Project: Aurora
>  Issue Type: Bug
>  Components: Client
>Reporter: Maxim Khutornenko
>Priority: Minor
>  Labels: newbie
>
> The recent [health checker refactoring|https://reviews.apache.org/r/41428] 
> added a new message:
> {noformat}
> WARNING: endpoint, expected_response, and expected_response_code are 
> deprecated and will be removed
> in the next release. Please consult updated documentation.
> {noformat}
> This message always shows up even when users don't explicitly specify 
> deprecated fields. It should only warn when the .aurora file contains 
> HealthCheckConfig with any of the deprecated fields.





[jira] [Created] (AURORA-1589) Dupe debs are showing up

2016-01-21 Thread Dmitriy Shirchenko (JIRA)
Dmitriy Shirchenko created AURORA-1589:
--

 Summary: Dupe debs are showing up
 Key: AURORA-1589
 URL: https://issues.apache.org/jira/browse/AURORA-1589
 Project: Aurora
  Issue Type: Story
Reporter: Dmitriy Shirchenko
Assignee: Bill Farner


Looks like jessie and trusty packages are colliding and trusty (last one to 
build) is winning out:

https://builds.apache.org/job/aurora-packaging-nightly/lastSuccessfulBuild/artifact/aurora-packaging/artifacts/





[jira] [Updated] (AURORA-1588) Unnecessary HealthCheckConfig endpoint warning

2016-01-21 Thread Bill Farner (JIRA)

 [ 
https://issues.apache.org/jira/browse/AURORA-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Farner updated AURORA-1588:

Labels: newbie  (was: )

> Unnecessary HealthCheckConfig endpoint warning
> --
>
> Key: AURORA-1588
> URL: https://issues.apache.org/jira/browse/AURORA-1588
> Project: Aurora
>  Issue Type: Bug
>  Components: Client
>Reporter: Maxim Khutornenko
>  Labels: newbie
>
> The recent [health checker refactoring|https://reviews.apache.org/r/41428] 
> added a new message:
> {noformat}
> WARNING: endpoint, expected_response, and expected_response_code are 
> deprecated and will be removed
> in the next release. Please consult updated documentation.
> {noformat}
> This message always shows up even when users don't explicitly specify 
> deprecated fields. It should only warn when the .aurora file contains 
> HealthCheckConfig with any of the deprecated fields.





[jira] [Updated] (AURORA-1588) Unnecessary HealthCheckConfig endpoint warning

2016-01-21 Thread Bill Farner (JIRA)

 [ 
https://issues.apache.org/jira/browse/AURORA-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Farner updated AURORA-1588:

Priority: Minor  (was: Major)

> Unnecessary HealthCheckConfig endpoint warning
> --
>
> Key: AURORA-1588
> URL: https://issues.apache.org/jira/browse/AURORA-1588
> Project: Aurora
>  Issue Type: Bug
>  Components: Client
>Reporter: Maxim Khutornenko
>Priority: Minor
>  Labels: newbie
>
> The recent [health checker refactoring|https://reviews.apache.org/r/41428] 
> added a new message:
> {noformat}
> WARNING: endpoint, expected_response, and expected_response_code are 
> deprecated and will be removed
> in the next release. Please consult updated documentation.
> {noformat}
> This message always shows up even when users don't explicitly specify 
> deprecated fields. It should only warn when the .aurora file contains 
> HealthCheckConfig with any of the deprecated fields.





[jira] [Updated] (AURORA-1588) Unnecessary HealthCheckConfig endpoint warning

2016-01-21 Thread Bill Farner (JIRA)

 [ 
https://issues.apache.org/jira/browse/AURORA-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Farner updated AURORA-1588:

Summary: Unnecessary HealthCheckConfig endpoint warning  (was: 
HealthCheckConfig endpoint warning shows up unnecessary )

> Unnecessary HealthCheckConfig endpoint warning
> --
>
> Key: AURORA-1588
> URL: https://issues.apache.org/jira/browse/AURORA-1588
> Project: Aurora
>  Issue Type: Bug
>  Components: Client
>Reporter: Maxim Khutornenko
>
> The recent [health checker refactoring|https://reviews.apache.org/r/41428] 
> added a new message:
> {noformat}
> WARNING: endpoint, expected_response, and expected_response_code are 
> deprecated and will be removed
> in the next release. Please consult updated documentation.
> {noformat}
> This message always shows up even when users don't explicitly specify 
> deprecated fields. It should only warn when the .aurora file contains 
> HealthCheckConfig with any of the deprecated fields.





[jira] [Created] (AURORA-1588) HealthCheckConfig endpoint warning shows up unnecessary

2016-01-21 Thread Maxim Khutornenko (JIRA)
Maxim Khutornenko created AURORA-1588:
-

 Summary: HealthCheckConfig endpoint warning shows up unnecessary 
 Key: AURORA-1588
 URL: https://issues.apache.org/jira/browse/AURORA-1588
 Project: Aurora
  Issue Type: Bug
  Components: Client
Reporter: Maxim Khutornenko


The recent [health checker refactoring|https://reviews.apache.org/r/41428] 
added a new message:
{noformat}
WARNING: endpoint, expected_response, and expected_response_code are deprecated 
and will be removed
in the next release. Please consult updated documentation.
{noformat}

This message always shows up even when users don't explicitly specify 
deprecated fields. It should only warn when the .aurora file contains 
HealthCheckConfig with any of the deprecated fields.





[jira] [Updated] (AURORA-1580) java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore

2016-01-21 Thread Zameer Manji (JIRA)

 [ 
https://issues.apache.org/jira/browse/AURORA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zameer Manji updated AURORA-1580:
-
Story Points:   (was: 8)

> java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore
> ---
>
> Key: AURORA-1580
> URL: https://issues.apache.org/jira/browse/AURORA-1580
> Project: Aurora
>  Issue Type: Bug
>  Components: Scheduler
>Reporter: Zameer Manji
>Assignee: Bill Farner
>
> I have discovered the following exception from a scheduler that is running 
> off master with the beta task store enabled.
> {noformat}
> E0113 22:51:55.941 [AsyncProcessor-2, AsyncUtil:123] 
> java.util.concurrent.ExecutionException: java.util.NoSuchElementException 
> java.util.concurrent.ExecutionException: java.util.NoSuchElementException
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
> ~[na:1.8.0_66-Tw8r9b1]
> at java.util.concurrent.FutureTask.get(FutureTask.java:192) 
> ~[na:1.8.0_66-Tw8r9b1]
> at 
> org.apache.aurora.scheduler.base.AsyncUtil.evaluateResult(AsyncUtil.java:118) 
> [aurora-110.jar:na]
> at 
> org.apache.aurora.scheduler.base.AsyncUtil.access$000(AsyncUtil.java:32) 
> [aurora-110.jar:na]
> at 
> org.apache.aurora.scheduler.base.AsyncUtil$1.afterExecute(AsyncUtil.java:59) 
> [aurora-110.jar:na]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1150)
>  [na:1.8.0_66-Tw8r9b1]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_66-Tw8r9b1]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66-Tw8r9b1]
> Caused by: java.util.NoSuchElementException: null
> at com.google.common.collect.Iterables.getLast(Iterables.java:784) 
> ~[guava-19.0.jar:na]
> at 
> org.apache.aurora.scheduler.base.Tasks.getLatestEvent(Tasks.java:149) 
> ~[aurora-110.jar:na]
> at org.apache.aurora.scheduler.base.Tasks$1.apply(Tasks.java:156) 
> ~[aurora-110.jar:na]
> at org.apache.aurora.scheduler.base.Tasks$1.apply(Tasks.java:153) 
> ~[aurora-110.jar:na]
> at 
> com.google.common.collect.ByFunctionOrdering.compare(ByFunctionOrdering.java:45)
>  ~[guava-19.0.jar:na]
> at java.util.TimSort.binarySort(TimSort.java:296) 
> ~[na:1.8.0_66-Tw8r9b1]
> at java.util.TimSort.sort(TimSort.java:239) ~[na:1.8.0_66-Tw8r9b1]
> at java.util.Arrays.sort(Arrays.java:1438) ~[na:1.8.0_66-Tw8r9b1]
> at com.google.common.collect.Ordering.sortedCopy(Ordering.java:860) 
> ~[guava-19.0.jar:na]
> at 
> org.apache.aurora.scheduler.pruning.TaskHistoryPruner.lambda$registerInactiveTask$20(TaskHistoryPruner.java:156)
>  ~[aurora-110.jar:na]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_66-Tw8r9b1]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_66-Tw8r9b1]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>  ~[na:1.8.0_66-Tw8r9b1]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>  ~[na:1.8.0_66-Tw8r9b1]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_66-Tw8r9b1]
> ... 2 common frames omitted
> {noformat}
> Similar exception occurs within the preemptor and causes the scheduler to 
> crash:
> {noformat}
> E0113 01:43:06.242 THREAD5037 
> com.google.common.util.concurrent.ServiceManager$ServiceListener.failed: 
> Service PreemptorService [FAILED] has failed in the RUNNING state.
> java.util.NoSuchElementException
> at com.google.common.collect.Iterables.getLast(Iterables.java:784)
> at 
> org.apache.aurora.scheduler.base.Tasks.getLatestEvent(Tasks.java:149)
> at 
> org.apache.aurora.scheduler.preemptor.PendingTaskProcessor$3.apply(PendingTaskProcessor.java:237)
> at 
> org.apache.aurora.scheduler.preemptor.PendingTaskProcessor$3.apply(PendingTaskProcessor.java:234)
> at 
> com.google.common.base.Predicates$AndPredicate.apply(Predicates.java:374)
> at 
> com.google.common.collect.Iterators$7.computeNext(Iterators.java:675)
> at 
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
> at 
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> at 
> com.google.common.collect.TransformedIterator.hasNext(TransformedIterator.java:43)
> at com.google.common.collect.Iterators.addAll(Iterators.java:364)
> at com.google.common.collect.Iterables.addAll(Iterables.java:352)
> at com.google.common.collect.Ha

[jira] [Updated] (AURORA-1580) java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore

2016-01-21 Thread Zameer Manji (JIRA)

 [ 
https://issues.apache.org/jira/browse/AURORA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zameer Manji updated AURORA-1580:
-
Sprint:   (was: Twitter Aurora Q1'16 Sprint 17)

> java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore
> ---
>
> Key: AURORA-1580
> URL: https://issues.apache.org/jira/browse/AURORA-1580
> Project: Aurora
>  Issue Type: Bug
>  Components: Scheduler
>Reporter: Zameer Manji
>Assignee: Bill Farner
>

[jira] [Commented] (AURORA-1580) java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore

2016-01-21 Thread Bill Farner (JIRA)

[ 
https://issues.apache.org/jira/browse/AURORA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111514#comment-15111514
 ] 

Bill Farner commented on AURORA-1580:
-

https://reviews.apache.org/r/42613/

> java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore
> ---
>
> Key: AURORA-1580
> URL: https://issues.apache.org/jira/browse/AURORA-1580
> Project: Aurora
>  Issue Type: Bug
>  Components: Scheduler
>Reporter: Zameer Manji
>Assignee: Bill Farner
>

[jira] [Assigned] (AURORA-1580) java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore

2016-01-21 Thread Bill Farner (JIRA)

 [ 
https://issues.apache.org/jira/browse/AURORA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Farner reassigned AURORA-1580:
---

Assignee: Bill Farner  (was: Zameer Manji)

> java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore
> ---
>
> Key: AURORA-1580
> URL: https://issues.apache.org/jira/browse/AURORA-1580
> Project: Aurora
>  Issue Type: Bug
>  Components: Scheduler
>Reporter: Zameer Manji
>Assignee: Bill Farner
>
> I have discovered the following exception from a scheduler that is running 
> off master with the beta task store enabled.
> {noformat}
> E0113 22:51:55.941 [AsyncProcessor-2, AsyncUtil:123] 
> java.util.concurrent.ExecutionException: java.util.NoSuchElementException 
> java.util.concurrent.ExecutionException: java.util.NoSuchElementException
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
> ~[na:1.8.0_66-Tw8r9b1]
> at java.util.concurrent.FutureTask.get(FutureTask.java:192) 
> ~[na:1.8.0_66-Tw8r9b1]
> at 
> org.apache.aurora.scheduler.base.AsyncUtil.evaluateResult(AsyncUtil.java:118) 
> [aurora-110.jar:na]
> at 
> org.apache.aurora.scheduler.base.AsyncUtil.access$000(AsyncUtil.java:32) 
> [aurora-110.jar:na]
> at 
> org.apache.aurora.scheduler.base.AsyncUtil$1.afterExecute(AsyncUtil.java:59) 
> [aurora-110.jar:na]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1150)
>  [na:1.8.0_66-Tw8r9b1]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_66-Tw8r9b1]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66-Tw8r9b1]
> Caused by: java.util.NoSuchElementException: null
> at com.google.common.collect.Iterables.getLast(Iterables.java:784) 
> ~[guava-19.0.jar:na]
> at 
> org.apache.aurora.scheduler.base.Tasks.getLatestEvent(Tasks.java:149) 
> ~[aurora-110.jar:na]
> at org.apache.aurora.scheduler.base.Tasks$1.apply(Tasks.java:156) 
> ~[aurora-110.jar:na]
> at org.apache.aurora.scheduler.base.Tasks$1.apply(Tasks.java:153) 
> ~[aurora-110.jar:na]
> at 
> com.google.common.collect.ByFunctionOrdering.compare(ByFunctionOrdering.java:45)
>  ~[guava-19.0.jar:na]
> at java.util.TimSort.binarySort(TimSort.java:296) 
> ~[na:1.8.0_66-Tw8r9b1]
> at java.util.TimSort.sort(TimSort.java:239) ~[na:1.8.0_66-Tw8r9b1]
> at java.util.Arrays.sort(Arrays.java:1438) ~[na:1.8.0_66-Tw8r9b1]
> at com.google.common.collect.Ordering.sortedCopy(Ordering.java:860) 
> ~[guava-19.0.jar:na]
> at 
> org.apache.aurora.scheduler.pruning.TaskHistoryPruner.lambda$registerInactiveTask$20(TaskHistoryPruner.java:156)
>  ~[aurora-110.jar:na]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_66-Tw8r9b1]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_66-Tw8r9b1]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>  ~[na:1.8.0_66-Tw8r9b1]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>  ~[na:1.8.0_66-Tw8r9b1]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_66-Tw8r9b1]
> ... 2 common frames omitted
> {noformat}
> A similar exception occurs within the preemptor and causes the scheduler to 
> crash:
> {noformat}
> E0113 01:43:06.242 THREAD5037 
> com.google.common.util.concurrent.ServiceManager$ServiceListener.failed: 
> Service PreemptorService [FAILED] has failed in the RUNNING state.
> java.util.NoSuchElementException
> at com.google.common.collect.Iterables.getLast(Iterables.java:784)
> at 
> org.apache.aurora.scheduler.base.Tasks.getLatestEvent(Tasks.java:149)
> at 
> org.apache.aurora.scheduler.preemptor.PendingTaskProcessor$3.apply(PendingTaskProcessor.java:237)
> at 
> org.apache.aurora.scheduler.preemptor.PendingTaskProcessor$3.apply(PendingTaskProcessor.java:234)
> at 
> com.google.common.base.Predicates$AndPredicate.apply(Predicates.java:374)
> at 
> com.google.common.collect.Iterators$7.computeNext(Iterators.java:675)
> at 
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
> at 
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> at 
> com.google.common.collect.TransformedIterator.hasNext(TransformedIterator.java:43)
> at com.google.common.collect.Iterators.addAll(Iterators.java:364)
> at com.google.common.collect.Iterables.addAll(Iterables.java:352)
> at com.g

[jira] [Commented] (AURORA-1052) Populate Labels in TaskConfig

2016-01-21 Thread Zameer Manji (JIRA)

[ 
https://issues.apache.org/jira/browse/AURORA-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1568#comment-1568
 ] 

Zameer Manji commented on AURORA-1052:
--

My concern is that if we have user-supplied labels in TaskConfig, we will have 
a backwards-compatibility issue if the scheduler starts populating labels as 
well. +1 to step #2.

> Populate Labels in TaskConfig
> -
>
> Key: AURORA-1052
> URL: https://issues.apache.org/jira/browse/AURORA-1052
> Project: Aurora
> Issue Type: Story
> Components: Scheduler
> Reporter: Stephan Erb
> Priority: Minor
> Labels: newbie
>
> Mesos has introduced labels on tasks (MESOS-2120). These correspond to what 
> Aurora calls metadata. 
> We should therefore set task labels according to our metadata information.





[jira] [Commented] (AURORA-1580) java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore

2016-01-21 Thread Bill Farner (JIRA)

[ 
https://issues.apache.org/jira/browse/AURORA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111055#comment-15111055
 ] 

Bill Farner commented on AURORA-1580:
-

{quote}
Is there a reason they were done as a cascading delete + re-insert to begin 
with? Was it just for the sake of not having to diff the task to see what had 
changed?
{quote}

That's effectively it.  The storage API is non-specific about what is being 
changed in the {{IScheduledTask}} object tree; a save is essentially a PUT.  While 
we could implement diffing, I think specific mutation verbs (of which there are 
few) are much more straightforward.
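
To make the PUT-versus-verbs distinction concrete, here is a minimal in-memory sketch. It is not Aurora's storage API: {{MutationVerbsDemo}}, {{putTask}}, {{addEvent}} and the string event names are all hypothetical, and a {{HashMap}} stands in for the database. It only illustrates that a whole-object PUT replaces the stored row, while a specific verb can change one part and leave the row's identity alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MutationVerbsDemo {
  /** Hypothetical stored task: a surrogate row id plus its event history. */
  static final class TaskRow {
    final long rowId;
    final List<String> events;

    TaskRow(long rowId, List<String> events) {
      this.rowId = rowId;
      this.events = new ArrayList<>(events);
    }
  }

  private final Map<String, TaskRow> tasks = new HashMap<>();
  private long nextRowId = 1;

  /** PUT-style save: the caller hands over the whole task and the old row is replaced. */
  long putTask(String taskId, List<String> events) {
    long rowId = nextRowId++;
    tasks.put(taskId, new TaskRow(rowId, events));
    return rowId;
  }

  /** Specific verb: append one event and leave the existing row in place. */
  long addEvent(String taskId, String event) {
    TaskRow row = tasks.get(taskId);
    if (row == null) {
      throw new IllegalArgumentException("Unknown task: " + taskId);
    }
    row.events.add(event);
    return row.rowId;
  }

  /** Returns a copy of the stored event history for a task. */
  List<String> eventsOf(String taskId) {
    return new ArrayList<>(tasks.get(taskId).events);
  }

  public static void main(String[] args) {
    MutationVerbsDemo store = new MutationVerbsDemo();
    long first = store.putTask("task-1", Arrays.asList("PENDING"));
    long afterVerb = store.addEvent("task-1", "ASSIGNED");
    long afterPut =
        store.putTask("task-1", Arrays.asList("PENDING", "ASSIGNED", "RUNNING"));
    System.out.println(first == afterVerb); // true: the verb kept the row id
    System.out.println(first == afterPut);  // false: the PUT issued a new row id
    System.out.println(store.eventsOf("task-1"));
  }
}
```

With verbs like this the store never needs to diff the object tree, because the caller already states what changed.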

> java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore
> ---
>
> Key: AURORA-1580
> URL: https://issues.apache.org/jira/browse/AURORA-1580
> Project: Aurora
> Issue Type: Bug
> Components: Scheduler
> Reporter: Zameer Manji
> Assignee: Zameer Manji
>
> I have discovered the following exception from a scheduler that is running 
> off master with the beta task store enabled.
> {noformat}
> E0113 22:51:55.941 [AsyncProcessor-2, AsyncUtil:123] 
> java.util.concurrent.ExecutionException: java.util.NoSuchElementException 
> java.util.concurrent.ExecutionException: java.util.NoSuchElementException
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
> ~[na:1.8.0_66-Tw8r9b1]
> at java.util.concurrent.FutureTask.get(FutureTask.java:192) 
> ~[na:1.8.0_66-Tw8r9b1]
> at 
> org.apache.aurora.scheduler.base.AsyncUtil.evaluateResult(AsyncUtil.java:118) 
> [aurora-110.jar:na]
> at 
> org.apache.aurora.scheduler.base.AsyncUtil.access$000(AsyncUtil.java:32) 
> [aurora-110.jar:na]
> at 
> org.apache.aurora.scheduler.base.AsyncUtil$1.afterExecute(AsyncUtil.java:59) 
> [aurora-110.jar:na]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1150)
>  [na:1.8.0_66-Tw8r9b1]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_66-Tw8r9b1]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66-Tw8r9b1]
> Caused by: java.util.NoSuchElementException: null
> at com.google.common.collect.Iterables.getLast(Iterables.java:784) 
> ~[guava-19.0.jar:na]
> at 
> org.apache.aurora.scheduler.base.Tasks.getLatestEvent(Tasks.java:149) 
> ~[aurora-110.jar:na]
> at org.apache.aurora.scheduler.base.Tasks$1.apply(Tasks.java:156) 
> ~[aurora-110.jar:na]
> at org.apache.aurora.scheduler.base.Tasks$1.apply(Tasks.java:153) 
> ~[aurora-110.jar:na]
> at 
> com.google.common.collect.ByFunctionOrdering.compare(ByFunctionOrdering.java:45)
>  ~[guava-19.0.jar:na]
> at java.util.TimSort.binarySort(TimSort.java:296) 
> ~[na:1.8.0_66-Tw8r9b1]
> at java.util.TimSort.sort(TimSort.java:239) ~[na:1.8.0_66-Tw8r9b1]
> at java.util.Arrays.sort(Arrays.java:1438) ~[na:1.8.0_66-Tw8r9b1]
> at com.google.common.collect.Ordering.sortedCopy(Ordering.java:860) 
> ~[guava-19.0.jar:na]
> at 
> org.apache.aurora.scheduler.pruning.TaskHistoryPruner.lambda$registerInactiveTask$20(TaskHistoryPruner.java:156)
>  ~[aurora-110.jar:na]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_66-Tw8r9b1]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_66-Tw8r9b1]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>  ~[na:1.8.0_66-Tw8r9b1]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>  ~[na:1.8.0_66-Tw8r9b1]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_66-Tw8r9b1]
> ... 2 common frames omitted
> {noformat}
> A similar exception occurs within the preemptor and causes the scheduler to 
> crash:
> {noformat}
> E0113 01:43:06.242 THREAD5037 
> com.google.common.util.concurrent.ServiceManager$ServiceListener.failed: 
> Service PreemptorService [FAILED] has failed in the RUNNING state.
> java.util.NoSuchElementException
> at com.google.common.collect.Iterables.getLast(Iterables.java:784)
> at 
> org.apache.aurora.scheduler.base.Tasks.getLatestEvent(Tasks.java:149)
> at 
> org.apache.aurora.scheduler.preemptor.PendingTaskProcessor$3.apply(PendingTaskProcessor.java:237)
> at 
> org.apache.aurora.scheduler.preemptor.PendingTaskProcessor$3.apply(PendingTaskProcessor.java:234)
> at 
> com.google.common.base.Predicates$AndPredicate.apply(Predicates.java:374)
> at 
> com.google.common.collect.Iterators$7.computeNext(Iterators.java:675)
> at 
>

[jira] [Comment Edited] (AURORA-1580) java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore

2016-01-21 Thread Joshua Cohen (JIRA)

[ 
https://issues.apache.org/jira/browse/AURORA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111038#comment-15111038
 ] 

Joshua Cohen edited comment on AURORA-1580 at 1/21/16 6:23 PM:
---

{quote}
I'd like to add, however, that implementing task mutations as insert/update 
rather than a full relational delete + re-insert would also address the issue.
{quote}

This seems like a better solution to me. Is there a reason they were done as a 
cascading delete + re-insert to begin with? Was it just for the sake of not 
having to diff the task to see what had changed?


was (Author: joshua.cohen):
{noformat}
I'd like to add, however, that implementing task mutations as insert/update 
rather than a full relational delete + re-insert would also address the issue.
{noformat}

This seems like a better solution to me. Is there a reason they were done as a 
cascading delete + re-insert to begin with? Was it just for the sake of not 
having to diff the task to see what had changed?


[jira] [Commented] (AURORA-1580) java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore

2016-01-21 Thread Joshua Cohen (JIRA)

[ 
https://issues.apache.org/jira/browse/AURORA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111038#comment-15111038
 ] 

Joshua Cohen commented on AURORA-1580:
--

{noformat}
I'd like to add, however, that implementing task mutations as insert/update 
rather than a full relational delete + re-insert would also address the issue.
{noformat}

This seems like a better solution to me. Is there a reason they were done as a 
cascading delete + re-insert to begin with? Was it just for the sake of not 
having to diff the task to see what had changed?


[jira] [Commented] (AURORA-1580) java.util.NoSuchElementException from Tasks.getLatestEvent with DbTaskStore

2016-01-21 Thread Bill Farner (JIRA)

[ 
https://issues.apache.org/jira/browse/AURORA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111031#comment-15111031
 ] 

Bill Farner commented on AURORA-1580:
-

I have determined why READ COMMITTED does not work in the above test case.  
Thankfully it is a SQL-level concurrency problem.

Here is our current sequence of events for saving a task:

1. {{DELETE FROM tasks WHERE task_id IN ( ? )}}
2. {{INSERT INTO tasks ...}}
3. {{INSERT INTO task_events ...}}
4. {{INSERT INTO task_ports ...}}

Here is the sequence for reading a task:

a. {{SELECT ... FROM tasks WHERE task_id = ?}}
b. {{SELECT ... FROM task_ports WHERE e.task_row_id = ?}}
c. {{SELECT ... FROM task_events WHERE e.task_row_id = ?}}

Even when READ COMMITTED behaves as expected, we can hit this issue with an 
interleaving like a1234bc: operations b and c then filter on a {{task_row_id}} 
that has since been deleted.  Making reads work under READ COMMITTED should 
therefore just be a matter of consistently using {{task_id}} in the WHERE clause.

I'd like to add, however, that implementing task mutations as insert/update 
rather than a full relational delete + re-insert would also address the issue.
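
The a1234bc interleaving can be reproduced without a database. The sketch below is an in-memory stand-in, not the scheduler's code: {{StaleRowIdDemo}}, its method names and the string statuses are hypothetical, and two collections play the role of the {{tasks}} and {{task_events}} tables ({{task_ports}} is left out for brevity). The reader fetches the row id (a), the writer deletes and re-inserts the task (1 to 3), and the reader then queries events with the stale row id (c).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StaleRowIdDemo {
  /** One row of the stand-in task_events table. */
  static final class EventRow {
    final long taskRowId;
    final String status;

    EventRow(long taskRowId, String status) {
      this.taskRowId = taskRowId;
      this.status = status;
    }
  }

  // Stand-in tasks table: task_id -> surrogate row id.
  static final Map<String, Long> tasks = new HashMap<>();
  static final List<EventRow> taskEvents = new ArrayList<>();
  static long nextRowId = 1;

  /** Write steps 1-3: cascading delete, then re-insert under a fresh row id. */
  static void saveTask(String taskId, List<String> statuses) {
    Long oldRowId = tasks.remove(taskId);
    if (oldRowId != null) {
      taskEvents.removeIf(row -> row.taskRowId == oldRowId);
    }
    long rowId = nextRowId++;
    tasks.put(taskId, rowId);
    for (String status : statuses) {
      taskEvents.add(new EventRow(rowId, status));
    }
  }

  /** Read step a: look up the row id of an existing task. */
  static long selectRowId(String taskId) {
    return tasks.get(taskId);
  }

  /** Read step c as listed above: filter events by task_row_id. */
  static List<String> eventsByRowId(long rowId) {
    List<String> result = new ArrayList<>();
    for (EventRow row : taskEvents) {
      if (row.taskRowId == rowId) {
        result.add(row.status);
      }
    }
    return result;
  }

  /** Alternative read: resolve task_id at query time, like a join through tasks. */
  static List<String> eventsByTaskId(String taskId) {
    Long rowId = tasks.get(taskId);
    if (rowId == null) {
      return new ArrayList<>();
    }
    return eventsByRowId(rowId);
  }

  public static void main(String[] args) {
    saveTask("task-1", Arrays.asList("PENDING"));
    long staleRowId = selectRowId("task-1");                  // a
    saveTask("task-1", Arrays.asList("PENDING", "ASSIGNED")); // 1, 2, 3
    System.out.println(eventsByRowId(staleRowId)); // c with the stale id: prints []
    System.out.println(eventsByTaskId("task-1"));  // prints [PENDING, ASSIGNED]
  }
}
```

The empty list from the stale lookup is the kind of input on which Guava's {{Iterables.getLast}} throws {{NoSuchElementException}}, which matches the traces in the issue description.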
