[jira] [Created] (AURORA-1594) End-to-end test is broken
Bill Farner created AURORA-1594: --- Summary: End-to-end test is broken Key: AURORA-1594 URL: https://issues.apache.org/jira/browse/AURORA-1594 Project: Aurora Issue Type: Bug Components: Scheduler Reporter: Bill Farner Priority: Blocker {noformat} + aurora job create devcluster/vagrant/test/http_example /vagrant/src/test/sh/org/apache/aurora/e2e/http/http_example.aurora WARN] WARNING: endpoint, expected_response, and expected_response_code are deprecated and will be removed in the next release. Please consult updated documentation. INFO] Creating job http_example WARN] Could not connect to scheduler: No schedulers detected in devcluster! WARN] Could not connect to scheduler: No schedulers detected in devcluster! Job creation failed due to error: java.lang.IllegalArgumentException: Multiple entries with same key: ITaskConfig{job=IJobKey{role=vagrant, environment=test, name=http_example}, owner=IIdentity{role=null, user=vagrant}, environment=null, jobName=null, isService=true, numCpus=0.4, ramMb=32, diskMb=64, priority=0, maxTaskFailures=1, production=false, tier=null, constraints=[], requestedPorts=[http], taskLinks={http=http://%host%:%port:http%}, contactEmail=vagrant@localhost, executorConfig=IExecutorConfig{name=AuroraExecutor, data={"environment": "test", "health_check_config": {"expected_response_code": 0, "endpoint": "/health", "health_checker": {"http": {"expected_response_code": 0, "endpoint": "/health", "expected_response": "ok"}}, "initial_interval_secs": 5.0, "expected_response": "ok", "max_consecutive_failures": 0, "timeout_secs": 1.0, "interval_secs": 1.0}, "name": "http_example", "service": true, "max_task_failures": 1, "cron_collision_policy": "KILL_EXISTING", "enable_hooks": false, "cluster": "devcluster", "task": {"processes": [{"daemon": false, "name": "stage_server", "ephemeral": false, "max_failures": 1, "min_duration": 5, "cmdline": "cp /vagrant/src/test/sh/org/apache/aurora/e2e/http_example.py .", "final": false}, {"daemon": false, "name": 
"run_server", "ephemeral": false, "max_failures": 1, "min_duration": 5, "cmdline": "python http_example.py {{thermos.ports[http]}}", "final": false}], "name": "http_example", "finalization_wait": 30, "max_failures": 1, "max_concurrency": 0, "resources": {"disk": 67108864, "ram": 33554432, "cpu": 0.4}, "constraints": [{"order": ["stage_server", "run_server"]}]}, "production": false, "role": "vagrant", "contact": "vagrant@localhost", "announce": {"primary_port": "http", "portmap": {"aurora": "http"}}, "lifecycle": {"http": {"graceful_shutdown_endpoint": "/quitquitquit", "port": "health", "shutdown_endpoint": "/abortabortabort"}}, "priority": 0}}, metadata=[], container=IContainer{setField=MESOS, value=IMesosContainer{}}}=org.apache.aurora.scheduler.storage.db.views.DbTaskConfig@7b345c31 and ITaskConfig{job=IJobKey{role=vagrant, environment=test, name=http_example}, owner=IIdentity{role=null, user=vagrant}, environment=null, jobName=null, isService=true, numCpus=0.4, ramMb=32, diskMb=64, priority=0, maxTaskFailures=1, production=false, tier=null, constraints=[], requestedPorts=[http], taskLinks={http=http://%host%:%port:http%}, contactEmail=vagrant@localhost, executorConfig=IExecutorConfig{name=AuroraExecutor, data={"environment": "test", "health_check_config": {"expected_response_code": 0, "endpoint": "/health", "health_checker": {"http": {"expected_response_code": 0, "endpoint": "/health", "expected_response": "ok"}}, "initial_interval_secs": 5.0, "expected_response": "ok", "max_consecutive_failures": 0, "timeout_secs": 1.0, "interval_secs": 1.0}, "name": "http_example", "service": true, "max_task_failures": 1, "cron_collision_policy": "KILL_EXISTING", "enable_hooks": false, "cluster": "devcluster", "task": {"processes": [{"daemon": false, "name": "stage_server", "ephemeral": false, "max_failures": 1, "min_duration": 5, "cmdline": "cp /vagrant/src/test/sh/org/apache/aurora/e2e/http_example.py .", "final": false}, {"daemon": false, "name": "run_server", "ephemeral": 
false, "max_failures": 1, "min_duration": 5, "cmdline": "python http_example.py {{thermos.ports[http]}}", "final": false}], "name": "http_example", "finalization_wait": 30, "max_failures": 1, "max_concurrency": 0, "resources": {"disk": 67108864, "ram": 33554432, "cpu": 0.4}, "constraints": [{"order": ["stage_server", "run_server"]}]}, "production": false, "role": "vagrant", "contact": "vagrant@localhost", "announce": {"primary_port": "http", "portmap": {"aurora": "http"}}, "lifecycle": {"http": {"graceful_shutdown_endpoint": "/quitquitquit", "port": "health", "shutdown_endpoint": "/abortabortabort"}}, "priority": 0}}, metadata=[], container=IContainer{setField=MESOS, value=IMesosContainer{}}}=org.apache.aurora.scheduler.storage.db.views.DbTaskConfig@7ac8690c. To index multiple
[jira] [Updated] (AURORA-1594) End-to-end test is broken
[ https://issues.apache.org/jira/browse/AURORA-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Farner updated AURORA-1594: Description: {noformat} + aurora job create devcluster/vagrant/test/http_example /vagrant/src/test/sh/org/apache/aurora/e2e/http/http_example.aurora WARN] WARNING: endpoint, expected_response, and expected_response_code are deprecated and will be removed in the next release. Please consult updated documentation. INFO] Creating job http_example WARN] Could not connect to scheduler: No schedulers detected in devcluster! WARN] Could not connect to scheduler: No schedulers detected in devcluster! Job creation failed due to error: java.lang.IllegalArgumentException: Multiple entries with same key: ITaskConfig{job=IJobKey{role=vagrant, environment=test, name=http_example}, owner=IIdentity{role=null, user=vagrant}, environment=null, jobName=null, isService=true, numCpus=0.4, ramMb=32, diskMb=64, priority=0, maxTaskFailures=1, production=false, tier=null, constraints=[], requestedPorts=[http], taskLinks={http=http://%host%:%port:http%}, contactEmail=vagrant@localhost, executorConfig=IExecutorConfig{name=AuroraExecutor, data={"environment": "test", "health_check_config": {"expected_response_code": 0, "endpoint": "/health", "health_checker": {"http": {"expected_response_code": 0, "endpoint": "/health", "expected_response": "ok"}}, "initial_interval_secs": 5.0, "expected_response": "ok", "max_consecutive_failures": 0, "timeout_secs": 1.0, "interval_secs": 1.0}, "name": "http_example", "service": true, "max_task_failures": 1, "cron_collision_policy": "KILL_EXISTING", "enable_hooks": false, "cluster": "devcluster", "task": {"processes": [{"daemon": false, "name": "stage_server", "ephemeral": false, "max_failures": 1, "min_duration": 5, "cmdline": "cp /vagrant/src/test/sh/org/apache/aurora/e2e/http_example.py .", "final": false}, {"daemon": false, "name": "run_server", "ephemeral": false, "max_failures": 1, "min_duration": 5, 
"cmdline": "python http_example.py {{thermos.ports[http]}}", "final": false}], "name": "http_example", "finalization_wait": 30, "max_failures": 1, "max_concurrency": 0, "resources": {"disk": 67108864, "ram": 33554432, "cpu": 0.4}, "constraints": [{"order": ["stage_server", "run_server"]}]}, "production": false, "role": "vagrant", "contact": "vagrant@localhost", "announce": {"primary_port": "http", "portmap": {"aurora": "http"}}, "lifecycle": {"http": {"graceful_shutdown_endpoint": "/quitquitquit", "port": "health", "shutdown_endpoint": "/abortabortabort"}}, "priority": 0}}, metadata=[], container=IContainer{setField=MESOS, value=IMesosContainer{}}}=org.apache.aurora.scheduler.storage.db.views.DbTaskConfig@7b345c31 and ITaskConfig{job=IJobKey{role=vagrant, environment=test, name=http_example}, owner=IIdentity{role=null, user=vagrant}, environment=null, jobName=null, isService=true, numCpus=0.4, ramMb=32, diskMb=64, priority=0, maxTaskFailures=1, production=false, tier=null, constraints=[], requestedPorts=[http], taskLinks={http=http://%host%:%port:http%}, contactEmail=vagrant@localhost, executorConfig=IExecutorConfig{name=AuroraExecutor, data={"environment": "test", "health_check_config": {"expected_response_code": 0, "endpoint": "/health", "health_checker": {"http": {"expected_response_code": 0, "endpoint": "/health", "expected_response": "ok"}}, "initial_interval_secs": 5.0, "expected_response": "ok", "max_consecutive_failures": 0, "timeout_secs": 1.0, "interval_secs": 1.0}, "name": "http_example", "service": true, "max_task_failures": 1, "cron_collision_policy": "KILL_EXISTING", "enable_hooks": false, "cluster": "devcluster", "task": {"processes": [{"daemon": false, "name": "stage_server", "ephemeral": false, "max_failures": 1, "min_duration": 5, "cmdline": "cp /vagrant/src/test/sh/org/apache/aurora/e2e/http_example.py .", "final": false}, {"daemon": false, "name": "run_server", "ephemeral": false, "max_failures": 1, "min_duration": 5, "cmdline": "python 
http_example.py {{thermos.ports[http]}}", "final": false}], "name": "http_example", "finalization_wait": 30, "max_failures": 1, "max_concurrency": 0, "resources": {"disk": 67108864, "ram": 33554432, "cpu": 0.4}, "constraints": [{"order": ["stage_server", "run_server"]}]}, "production": false, "role": "vagrant", "contact": "vagrant@localhost", "announce": {"primary_port": "http", "portmap": {"aurora": "http"}}, "lifecycle": {"http": {"graceful_shutdown_endpoint": "/quitquitquit", "port": "health", "shutdown_endpoint": "/abortabortabort"}}, "priority": 0}}, metadata=[], container=IContainer{setField=MESOS, value=IMesosContainer{}}}=org.apache.aurora.scheduler.storage.db.views.DbTaskConfig@7ac8690c. To index multiple values under a key, use Multimaps.index. + collect_result + [[ 1 = 0 ]] + echo '!!! FAIL (something returned non-zero) for [[ $RETCODE = 0 ]]' {noformat} Stack trace:
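The closing hint in the error above ("To index multiple values under a key, use Multimaps.index") is the wording Guava uses when a unique index is built from entries that share a key; here two DbTaskConfig rows map to an equal ITaskConfig. The scheduler is Java, so the following is only a Python sketch of the distinction between a unique index and a multi-valued index. The names unique_index, multi_index, and the sample rows are hypothetical and are not taken from the Aurora code base.

```python
from collections import defaultdict


def unique_index(values, key_fn):
    """Index values by key; fail if two values produce the same key."""
    index = {}
    for value in values:
        key = key_fn(value)
        if key in index:
            raise ValueError(
                "Multiple entries with same key: %r and %r" % (index[key], value))
        index[key] = value
    return index


def multi_index(values, key_fn):
    """Index values by key, keeping every value that shares a key."""
    index = defaultdict(list)
    for value in values:
        index[key_fn(value)].append(value)
    return dict(index)


# Hypothetical data: two stored rows that carry an identical task config.
rows = [("row-1", "http_example"), ("row-2", "http_example")]


def config_of(row):
    return row[1]
```

With these rows, unique_index(rows, config_of) raises, while multi_index(rows, config_of) groups both rows under the one config. Whether the right fix is to deduplicate the rows or to index differently is not settled by the report above.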
[jira] [Commented] (AURORA-1052) Populate Labels in TaskConfig
[ https://issues.apache.org/jira/browse/AURORA-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15115697#comment-15115697 ] Stephan Erb commented on AURORA-1052: - I'll try to look into that in a couple of days. Idea would be to have a command line flag defaulting to `org.apache.aurora.metadata` that will be used as a prefix for any metadata entry mapped to a label. This prefix could be changed globally by the cluster administrator to `com.myorganization` but could also be set to empty if he wants to leave it up to the user to decide. What do you think? > Populate Labels in TaskConfig > - > > Key: AURORA-1052 > URL: https://issues.apache.org/jira/browse/AURORA-1052 > Project: Aurora > Issue Type: Story > Components: Scheduler >Reporter: Stephan Erb >Priority: Minor > Labels: newbie > > Mesos has introduced labels on tasks (MESOS-2120). These correspond to what > Aurora calls metadata. > We should therefore set task labels according to our metadata information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
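The prefix idea in the comment above can be sketched as follows. This is a minimal illustration, not the scheduler's implementation: the function name metadata_to_labels, the use of "." as the separator between prefix and key, and the representation of metadata as (key, value) pairs are all assumptions. Only the default prefix `org.apache.aurora.metadata` and the example override `com.myorganization` come from the comment.

```python
# Default taken from the comment above; everything else is hypothetical.
DEFAULT_PREFIX = "org.apache.aurora.metadata"


def metadata_to_labels(metadata, prefix=DEFAULT_PREFIX):
    """Map (key, value) metadata pairs to (label_key, value) pairs.

    An empty prefix leaves the key untouched, which leaves naming
    entirely up to the user. The "." separator is an assumption.
    """
    labels = []
    for key, value in metadata:
        label_key = "%s.%s" % (prefix, key) if prefix else key
        labels.append((label_key, value))
    return labels
```

For example, metadata_to_labels([("team", "infra")], prefix="com.myorganization") yields a single label keyed com.myorganization.team.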
[jira] [Commented] (AURORA-1052) Populate Labels in TaskConfig
[ https://issues.apache.org/jira/browse/AURORA-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15115712#comment-15115712 ] Stephan Erb commented on AURORA-1052: - I'd think the end user would not need an additional interface. He could simply use an appropriate metadata key himself. > Populate Labels in TaskConfig > - > > Key: AURORA-1052 > URL: https://issues.apache.org/jira/browse/AURORA-1052 > Project: Aurora > Issue Type: Story > Components: Scheduler >Reporter: Stephan Erb >Priority: Minor > Labels: newbie > > Mesos has introduced labels on tasks (MESOS-2120). These correspond to what > Aurora calls metadata. > We should therefore set task labels according to our metadata information.
[jira] [Issue Comment Deleted] (AURORA-1052) Populate Labels in TaskConfig
[ https://issues.apache.org/jira/browse/AURORA-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Erb updated AURORA-1052: Comment: was deleted (was: I'll try to look into that in a couple of days. Idea would be to have a command line flag defaulting to `org.apache.aurora.metadata` that will be used as a prefix for any metadata entry mapped to a label. This prefix could be changed globally by the cluster administrator to `com.myorganization` but could also be set to empty if he wants to leave it up to the user to decide. What do you think?) > Populate Labels in TaskConfig > - > > Key: AURORA-1052 > URL: https://issues.apache.org/jira/browse/AURORA-1052 > Project: Aurora > Issue Type: Story > Components: Scheduler >Reporter: Stephan Erb >Priority: Minor > Labels: newbie > > Mesos has introduced labels on tasks (MESOS-2120). These correspond to what > Aurora calls metadata. > We should therefore set task labels according to our metadata information.
[jira] [Assigned] (AURORA-1254) Remove UpdateConfig.restart_threshold
[ https://issues.apache.org/jira/browse/AURORA-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Sirois reassigned AURORA-1254: --- Assignee: John Sirois > Remove UpdateConfig.restart_threshold > - > > Key: AURORA-1254 > URL: https://issues.apache.org/jira/browse/AURORA-1254 > Project: Aurora > Issue Type: Task > Components: Client >Reporter: Bill Farner >Assignee: John Sirois >Priority: Minor > > This field has been deprecated as it no longer does anything.
[jira] [Commented] (AURORA-1052) Populate Labels in TaskConfig
[ https://issues.apache.org/jira/browse/AURORA-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15115677#comment-15115677 ] Zhitao Li commented on AURORA-1052: --- One more comment: the Mesos community is planning on populating Docker labels from Mesos labels in https://issues.apache.org/jira/browse/MESOS-4446, so following #2 probably makes more sense to avoid conflicts. > Populate Labels in TaskConfig > - > > Key: AURORA-1052 > URL: https://issues.apache.org/jira/browse/AURORA-1052 > Project: Aurora > Issue Type: Story > Components: Scheduler >Reporter: Stephan Erb >Priority: Minor > Labels: newbie > > Mesos has introduced labels on tasks (MESOS-2120). These correspond to what > Aurora calls metadata. > We should therefore set task labels according to our metadata information.
[jira] [Commented] (AURORA-1052) Populate Labels in TaskConfig
[ https://issues.apache.org/jira/browse/AURORA-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15115698#comment-15115698 ] Stephan Erb commented on AURORA-1052: - I'll try to look into that in a couple of days. Idea would be to have a command line flag defaulting to `org.apache.aurora.metadata` that will be used as a prefix for any metadata entry mapped to a label. This prefix could be changed globally by the cluster administrator to `com.myorganization` but could also be set to empty if he wants to leave it up to the user to decide. What do you think? > Populate Labels in TaskConfig > - > > Key: AURORA-1052 > URL: https://issues.apache.org/jira/browse/AURORA-1052 > Project: Aurora > Issue Type: Story > Components: Scheduler >Reporter: Stephan Erb >Priority: Minor > Labels: newbie > > Mesos has introduced labels on tasks (MESOS-2120). These correspond to what > Aurora calls metadata. > We should therefore set task labels according to our metadata information.
[jira] [Assigned] (AURORA-1563) Deprecate endpoint, expected_response and expected_response_code from HealthCheckConfig
[ https://issues.apache.org/jira/browse/AURORA-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Sirois reassigned AURORA-1563: --- Assignee: John Sirois > Deprecate endpoint, expected_response and expected_response_code from > HealthCheckConfig > --- > > Key: AURORA-1563 > URL: https://issues.apache.org/jira/browse/AURORA-1563 > Project: Aurora > Issue Type: Story >Reporter: Dmitriy Shirchenko >Assignee: John Sirois > Fix For: 0.12.0 > > > For example, remove deprecated code from health_checker.py and config.py > which supports 2 ways of getting attributes listed in the title of this task.
[jira] [Assigned] (AURORA-1594) End-to-end test is broken
[ https://issues.apache.org/jira/browse/AURORA-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Farner reassigned AURORA-1594: --- Assignee: Bill Farner > End-to-end test is broken > - > > Key: AURORA-1594 > URL: https://issues.apache.org/jira/browse/AURORA-1594 > Project: Aurora > Issue Type: Bug > Components: Scheduler >Reporter: Bill Farner >Assignee: Bill Farner >Priority: Blocker > > {noformat} > + aurora job create devcluster/vagrant/test/http_example > /vagrant/src/test/sh/org/apache/aurora/e2e/http/http_example.aurora > WARN] > WARNING: endpoint, expected_response, and expected_response_code are > deprecated and will be removed > in the next release. Please consult updated documentation. > INFO] Creating job http_example > WARN] Could not connect to scheduler: No schedulers detected in devcluster! > WARN] Could not connect to scheduler: No schedulers detected in devcluster! > Job creation failed due to error: > java.lang.IllegalArgumentException: Multiple entries with same key: > ITaskConfig{job=IJobKey{role=vagrant, environment=test, name=http_example}, > owner=IIdentity{role=null, user=vagrant}, environment=null, jobName=null, > isService=true, numCpus=0.4, ramMb=32, diskMb=64, priority=0, > maxTaskFailures=1, production=false, tier=null, constraints=[], > requestedPorts=[http], taskLinks={http=http://%host%:%port:http%}, > contactEmail=vagrant@localhost, > executorConfig=IExecutorConfig{name=AuroraExecutor, data={"environment": > "test", "health_check_config": {"expected_response_code": 0, "endpoint": > "/health", "health_checker": {"http": {"expected_response_code": 0, > "endpoint": "/health", "expected_response": "ok"}}, "initial_interval_secs": > 5.0, "expected_response": "ok", "max_consecutive_failures": 0, > "timeout_secs": 1.0, "interval_secs": 1.0}, "name": "http_example", > "service": true, "max_task_failures": 1, "cron_collision_policy": > "KILL_EXISTING", "enable_hooks": false, "cluster": "devcluster", "task": > 
{"processes": [{"daemon": false, "name": "stage_server", "ephemeral": false, > "max_failures": 1, "min_duration": 5, "cmdline": "cp > /vagrant/src/test/sh/org/apache/aurora/e2e/http_example.py .", "final": > false}, {"daemon": false, "name": "run_server", "ephemeral": false, > "max_failures": 1, "min_duration": 5, "cmdline": "python http_example.py > {{thermos.ports[http]}}", "final": false}], "name": "http_example", > "finalization_wait": 30, "max_failures": 1, "max_concurrency": 0, > "resources": {"disk": 67108864, "ram": 33554432, "cpu": 0.4}, "constraints": > [{"order": ["stage_server", "run_server"]}]}, "production": false, "role": > "vagrant", "contact": "vagrant@localhost", "announce": {"primary_port": > "http", "portmap": {"aurora": "http"}}, "lifecycle": {"http": > {"graceful_shutdown_endpoint": "/quitquitquit", "port": "health", > "shutdown_endpoint": "/abortabortabort"}}, "priority": 0}}, metadata=[], > container=IContainer{setField=MESOS, > value=IMesosContainer{}}}=org.apache.aurora.scheduler.storage.db.views.DbTaskConfig@7b345c31 > and ITaskConfig{job=IJobKey{role=vagrant, environment=test, > name=http_example}, owner=IIdentity{role=null, user=vagrant}, > environment=null, jobName=null, isService=true, numCpus=0.4, ramMb=32, > diskMb=64, priority=0, maxTaskFailures=1, production=false, tier=null, > constraints=[], requestedPorts=[http], > taskLinks={http=http://%host%:%port:http%}, contactEmail=vagrant@localhost, > executorConfig=IExecutorConfig{name=AuroraExecutor, data={"environment": > "test", "health_check_config": {"expected_response_code": 0, "endpoint": > "/health", "health_checker": {"http": {"expected_response_code": 0, > "endpoint": "/health", "expected_response": "ok"}}, "initial_interval_secs": > 5.0, "expected_response": "ok", "max_consecutive_failures": 0, > "timeout_secs": 1.0, "interval_secs": 1.0}, "name": "http_example", > "service": true, "max_task_failures": 1, "cron_collision_policy": > "KILL_EXISTING", "enable_hooks": false, 
"cluster": "devcluster", "task": > {"processes": [{"daemon": false, "name": "stage_server", "ephemeral": false, > "max_failures": 1, "min_duration": 5, "cmdline": "cp > /vagrant/src/test/sh/org/apache/aurora/e2e/http_example.py .", "final": > false}, {"daemon": false, "name": "run_server", "ephemeral": false, > "max_failures": 1, "min_duration": 5, "cmdline": "python http_example.py > {{thermos.ports[http]}}", "final": false}], "name": "http_example", > "finalization_wait": 30, "max_failures": 1, "max_concurrency": 0, > "resources": {"disk": 67108864, "ram": 33554432, "cpu": 0.4}, "constraints": > [{"order": ["stage_server", "run_server"]}]}, "production": false, "role": > "vagrant", "contact": "vagrant@localhost", "announce": {"primary_port": > "http",
[jira] [Commented] (AURORA-1258) Improve procedure for adding instances to a job
[ https://issues.apache.org/jira/browse/AURORA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15116278#comment-15116278 ] Maxim Khutornenko commented on AURORA-1258: --- Final details are captured here: http://markmail.org/message/2smaej5n5e54li3g > Improve procedure for adding instances to a job > --- > > Key: AURORA-1258 > URL: https://issues.apache.org/jira/browse/AURORA-1258 > Project: Aurora > Issue Type: Story > Components: Reliability, Usability >Reporter: Joe Smith >Assignee: Maxim Khutornenko > > The current process for adding instances to a job is highly manual, and > potentially dangerous. > 1. Take a config for a job with 10 instances, update it to 20 instances. > 2. The batch size will be increased, and users will need to specify shards 10 > to 19. > 3. After this update is complete, users will need to manually update shards > 0-9 again. > There may be other changes pulled in as part of this update other than just > increasing the number of instances, which could further complicate things. > One possible improvement would be to change the updater from > 'under-provision' where it kills instances first, then schedules new > instances, to an 'over-provision' where it adds on new instances, then > backpedals and kills the old instances. > Overall, a single command or process for a user to take an already-existing > job and increase the number of instances would reduce overhead and > fat-fingering.
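The difference between the 'under-provision' and 'over-provision' orderings described in the issue can be illustrated with a small simulation. This is a sketch under simplifying assumptions: instances start and stop instantly, every batch succeeds, and the function name running_counts and its parameters are hypothetical, not part of the Aurora updater.

```python
def running_counts(total, batch_size, over_provision):
    """Return the running-instance count after each step of a rolling
    replacement of `total` instances, `batch_size` at a time.

    over_provision=False: kill a batch first, then schedule replacements.
    over_provision=True:  schedule replacements first, then kill the old batch.
    """
    counts = [total]
    running = total
    replaced = 0
    while replaced < total:
        batch = min(batch_size, total - replaced)
        if over_provision:
            running += batch   # new instances come up first
            counts.append(running)
            running -= batch   # then the old ones are killed
            counts.append(running)
        else:
            running -= batch   # old instances are killed first
            counts.append(running)
            running += batch   # then replacements are scheduled
            counts.append(running)
        replaced += batch
    return counts
```

Under these assumptions, replacing 10 instances in batches of 3 dips to 7 running instances with the kill-first ordering, while the add-first ordering never drops below 10 but temporarily needs capacity for 13.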
[jira] [Commented] (AURORA-1593) PubSubEventModule fails to dispatch events to TaskHistoryPruner on startup
[ https://issues.apache.org/jira/browse/AURORA-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15116291#comment-15116291 ] John Sirois commented on AURORA-1593: - Just noting that the interaction here is more complex than it may first appear, since the prunes are executed via a DelayExecutor which is gated (work units are queued) during storage start (log recovery). Still digging a bit to make sure I have this all sussed pre-Zameer's change, with Zameer's change and with my change. > PubSubEventModule fails to dispatch events to TaskHistoryPruner on startup > -- > > Key: AURORA-1593 > URL: https://issues.apache.org/jira/browse/AURORA-1593 > Project: Aurora > Issue Type: Bug >Reporter: Zameer Manji >Assignee: John Sirois > > On latest master I see several exceptions that look like: > {noformat} > E0122 22:59:19.272 [AsyncProcessor-7, PubsubEventModule:84] Failed to > dispatch event to public void > org.apache.aurora.scheduler.pruning.TaskHistoryPruner.recordStateChange(org.apache.aurora.scheduler.events.PubsubEvent$TaskStateChange): > java.lang.IllegalStateException > java.lang.IllegalStateException: null > at > com.google.common.base.Preconditions.checkState(Preconditions.java:159) > ~[guava-19.0.jar:na] > at > org.apache.aurora.scheduler.pruning.TaskHistoryPruner.recordStateChange(TaskHistoryPruner.java:117) > ~[aurora-116.jar:na] > at sun.reflect.GeneratedMethodAccessor116.invoke(Unknown Source) > ~[na:na] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[na:1.8.0_66-Tw8r9b2] > at java.lang.reflect.Method.invoke(Method.java:497) > ~[na:1.8.0_66-Tw8r9b2] > at > com.google.common.eventbus.Subscriber.invokeSubscriberMethod(Subscriber.java:95) > ~[guava-19.0.jar:na] > at > com.google.common.eventbus.Subscriber$SynchronizedSubscriber.invokeSubscriberMethod(Subscriber.java:154) > ~[guava-19.0.jar:na] > at com.google.common.eventbus.Subscriber$1.run(Subscriber.java:80) > ~[guava-19.0.jar:na] > at > 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_66-Tw8r9b2] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[na:1.8.0_66-Tw8r9b2] > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > ~[na:1.8.0_66-Tw8r9b2] > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > ~[na:1.8.0_66-Tw8r9b2] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > ~[na:1.8.0_66-Tw8r9b2] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > ~[na:1.8.0_66-Tw8r9b2] > at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_66-Tw8r9b2] > {noformat} > The problem is that {{TaskHistoryPruner}} assumes it is started before the > event bus starts sending events to the service. This appears to not be the > case.
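The failure mode described in the issue, a subscriber that asserts it has been started before it will accept events, can be sketched as follows. This is a Python illustration of the ordering problem only; the classes Pruner and GatedBus are hypothetical, and the queue-until-started gate is one possible mitigation shown for contrast, not the change made in the scheduler.

```python
class Pruner:
    """Subscriber that, like a checkState guard, rejects events before start()."""

    def __init__(self):
        self.started = False
        self.recorded = []

    def start(self):
        self.started = True

    def record_state_change(self, event):
        if not self.started:
            # Stands in for the precondition that throws IllegalStateException.
            raise RuntimeError("event received before start()")
        self.recorded.append(event)


class GatedBus:
    """Hypothetical bus that queues events until the subscriber is started."""

    def __init__(self, subscriber):
        self.subscriber = subscriber
        self.pending = []
        self.is_open = False

    def post(self, event):
        if self.is_open:
            self.subscriber.record_state_change(event)
        else:
            self.pending.append(event)

    def open_gate(self):
        # Start the subscriber first, then flush queued events in order.
        self.subscriber.start()
        self.is_open = True
        for event in self.pending:
            self.subscriber.record_state_change(event)
        self.pending = []
```

Calling Pruner().record_state_change(...) directly before start() raises, which mirrors the exception in the log; routing the same events through GatedBus delivers them in order once open_gate() has run.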
[jira] [Commented] (AURORA-1593) PubSubEventModule fails to dispatch events to TaskHistoryPruner on startup
[ https://issues.apache.org/jira/browse/AURORA-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15116365#comment-15116365 ] John Sirois commented on AURORA-1593: - OK - still sorting through the {{DelayExecutor}} gating, but it does not look like it gates any {{TaskStateChange}} events - the events consumed by {{TaskHistoryPruner}}. > PubSubEventModule fails to dispatch events to TaskHistoryPruner on startup > -- > > Key: AURORA-1593 > URL: https://issues.apache.org/jira/browse/AURORA-1593 > Project: Aurora > Issue Type: Bug >Reporter: Zameer Manji >Assignee: John Sirois > > On latest master I see several exceptions that look like: > {noformat} > E0122 22:59:19.272 [AsyncProcessor-7, PubsubEventModule:84] Failed to > dispatch event to public void > org.apache.aurora.scheduler.pruning.TaskHistoryPruner.recordStateChange(org.apache.aurora.scheduler.events.PubsubEvent$TaskStateChange): > java.lang.IllegalStateException > java.lang.IllegalStateException: null > at > com.google.common.base.Preconditions.checkState(Preconditions.java:159) > ~[guava-19.0.jar:na] > at > org.apache.aurora.scheduler.pruning.TaskHistoryPruner.recordStateChange(TaskHistoryPruner.java:117) > ~[aurora-116.jar:na] > at sun.reflect.GeneratedMethodAccessor116.invoke(Unknown Source) > ~[na:na] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[na:1.8.0_66-Tw8r9b2] > at java.lang.reflect.Method.invoke(Method.java:497) > ~[na:1.8.0_66-Tw8r9b2] > at > com.google.common.eventbus.Subscriber.invokeSubscriberMethod(Subscriber.java:95) > ~[guava-19.0.jar:na] > at > com.google.common.eventbus.Subscriber$SynchronizedSubscriber.invokeSubscriberMethod(Subscriber.java:154) > ~[guava-19.0.jar:na] > at com.google.common.eventbus.Subscriber$1.run(Subscriber.java:80) > ~[guava-19.0.jar:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_66-Tw8r9b2] > at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[na:1.8.0_66-Tw8r9b2] > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > ~[na:1.8.0_66-Tw8r9b2] > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > ~[na:1.8.0_66-Tw8r9b2] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > ~[na:1.8.0_66-Tw8r9b2] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > ~[na:1.8.0_66-Tw8r9b2] > at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_66-Tw8r9b2] > {noformat} > The problem is that {{TaskHistoryPruner}} assumes it is started before the > event bus starts sending events to the service. This appears to not be the > case.