[ 
https://issues.apache.org/jira/browse/AURORA-1507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxim Khutornenko reassigned AURORA-1507:
-----------------------------------------

    Assignee: Maxim Khutornenko

> Instance collision stack trace in updater
> -----------------------------------------
>
>                 Key: AURORA-1507
>                 URL: https://issues.apache.org/jira/browse/AURORA-1507
>             Project: Aurora
>          Issue Type: Bug
>          Components: Scheduler
>            Reporter: Maxim Khutornenko
>            Assignee: Maxim Khutornenko
>
> I have spent some time investigating this error but still have no reasonable 
> leads. Dumping relevant logs here in case anyone encountered it or has better 
> understanding of what's going on here:
> {noformat}
> I0929 01:43:54.700 THREAD4752 
> org.apache.aurora.scheduler.updater.InstanceActionHandler$AddTask.getReevaluationDelay:
>  Adding instance IInstanceKey{jobKey=IJobKey{role=role, environment=prod, 
> name=job}, instanceId=26} while ROLLING_FORWARD
> I0929 01:43:08.902 THREAD5112 
> org.apache.aurora.common.util.StateMachine$Builder$1.execute: 
> 1443229138618-role-prod-job-13-41a49672-2698-4f82-b976-4e1e8edd2d5b state 
> machine transition KILLING -> KILLED
> I0929 01:43:08.902 THREAD5112 
> org.apache.aurora.scheduler.state.TaskStateMachine.addFollowup: Adding work 
> command SAVE_STATE for 
> 1443229138618-role-prod-job-13-41a49672-2698-4f82-b976-4e1e8edd2d5b
> I0929 01:43:48.543 THREAD4752 
> org.apache.aurora.scheduler.updater.JobUpdateControllerImpl$6.execute: 
> Forwarding task change for role/prod/job/25
> I0929 01:43:48.558 THREAD4752 
> org.apache.aurora.scheduler.updater.OneWayJobUpdater.evaluateInstances: 
> Ignoring state change for instance outside working set: 25
> I0929 01:43:50.982 THREAD4752 
> org.apache.aurora.scheduler.storage.mem.MemTaskStore.fetchTasks: Query took 
> 27 ms: TaskQuery(owner:null, role:role, environment:prod, jobName:job, 
> taskIds:null, statuses:[PENDING, DRAINING, ASSIGNED, STARTING, RUNNING, 
> PREEMPTING, RESTARTING, THROTTLED, KILLING], instanceIds:[13], 
> slaveHosts:null, jobKeys:null, offset:0, limit:0)
> I0929 01:43:50.983 THREAD4752 
> org.apache.aurora.common.util.StateMachine$Builder$1.execute: Instance 13 
> state machine transition IDLE -> WORKING
> I0929 01:43:54.667 THREAD4752 
> org.apache.aurora.scheduler.updater.InstanceActionHandler$AddTask.getReevaluationDelay:
>  Adding instance IInstanceKey{jobKey=IJobKey{role=role, environment=prod, 
> name=job}, instanceId=13} while ROLLING_FORWARD
> I0929 01:43:54.686 THREAD4752 
> org.apache.aurora.common.util.StateMachine$Builder$1.execute: 
> 1443491034668-role-prod-job-13-ee1cd4f8-a902-47d2-8aae-fca7d53930b9 state 
> machine transition INIT -> PENDING
> I0929 01:43:54.687 THREAD4752 
> org.apache.aurora.scheduler.state.TaskStateMachine.addFollowup: Adding work 
> command SAVE_STATE for 
> 1443491034668-role-prod-job-13-ee1cd4f8-a902-47d2-8aae-fca7d53930b9
> I0929 01:43:57.058 THREAD4751 
> org.apache.aurora.scheduler.updater.JobUpdateControllerImpl$6.execute: 
> Forwarding task change for role/prod/job/13
> I0929 01:43:58.159 THREAD4751 
> org.apache.aurora.scheduler.updater.JobUpdateControllerImpl.evaluateUpdater: 
> Executing side-effects for update of IJobUpdateKey{job=IJobKey{role=role, 
> environment=prod, name=job}, id=3042db11-c6f9-4344-a2c0-7e584a49acc4}: 
> {13=SideEffect{action=Optional.of(ADD_TASK), statusChanges=[]}
> I0929 01:43:58.160 THREAD4751 
> org.apache.aurora.scheduler.updater.InstanceActionHandler$AddTask.getReevaluationDelay:
>  Adding instance IInstanceKey{jobKey=IJobKey{role=role, environment=prod, 
> name=job}, instanceId=13} while ROLLING_FORWARD
> E0929 01:43:58.178 THREAD4751 
> org.apache.aurora.scheduler.updater.JobUpdateEventSubscriber.taskChangedState:
>  Failed to handle state change: java.lang.IllegalArgumentException: Instance 
> ID collision detected.
> java.lang.IllegalArgumentException: Instance ID collision detected.
>         at 
> org.apache.aurora.scheduler.state.StateManagerImpl.insertPendingTasks(StateManagerImpl.java:138)
>         at 
> org.apache.aurora.scheduler.updater.InstanceActionHandler$AddTask.getReevaluationDelay(InstanceActionHandler.java:85)
>         at 
> org.apache.aurora.scheduler.updater.JobUpdateControllerImpl.evaluateUpdater(JobUpdateControllerImpl.java:668)
>         at 
> org.apache.aurora.scheduler.updater.JobUpdateControllerImpl.access$1400(JobUpdateControllerImpl.java:108)
>         at 
> org.apache.aurora.scheduler.updater.JobUpdateControllerImpl$6.execute(JobUpdateControllerImpl.java:356)
>         at 
> org.apache.aurora.scheduler.storage.Storage$MutateWork$NoResult.apply(Storage.java:137)
>         at 
> org.apache.aurora.scheduler.storage.Storage$MutateWork$NoResult.apply(Storage.java:132)
>         at 
> org.apache.aurora.scheduler.storage.log.LogStorage$24.apply(LogStorage.java:614)
>         at 
> org.apache.aurora.scheduler.storage.log.LogStorage$24.apply(LogStorage.java:611)
>         at 
> org.apache.aurora.scheduler.storage.db.DbStorage.transactionedWrite(DbStorage.java:146)
>         at 
> org.mybatis.guice.transactional.TransactionalMethodInterceptor.invoke(TransactionalMethodInterceptor.java:101)
>         at 
> org.apache.aurora.scheduler.storage.db.DbStorage$2.doWithGateClosed(DbStorage.java:162)
>         at 
> org.apache.aurora.scheduler.async.GatingDelayExecutor.closeDuring(GatingDelayExecutor.java:62)
>         at 
> org.apache.aurora.scheduler.storage.db.DbStorage.write(DbStorage.java:158)
>         at 
> org.apache.aurora.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:84)
>         at 
> org.apache.aurora.scheduler.storage.log.LogStorage.doInTransaction(LogStorage.java:611)
>         at 
> org.apache.aurora.scheduler.storage.log.LogStorage.write(LogStorage.java:644)
>         at 
> org.apache.aurora.scheduler.storage.CallOrderEnforcingStorage.write(CallOrderEnforcingStorage.java:130)
>         at 
> org.apache.aurora.scheduler.updater.JobUpdateControllerImpl.instanceChanged(JobUpdateControllerImpl.java:347)
>         at 
> org.apache.aurora.scheduler.updater.JobUpdateControllerImpl.instanceChangedState(JobUpdateControllerImpl.java:332)
>         at 
> org.apache.aurora.scheduler.updater.JobUpdateEventSubscriber.taskChangedState(JobUpdateEventSubscriber.java:56)
>         at sun.reflect.GeneratedMethodAccessor121.invoke(Unknown Source)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at 
> com.google.common.eventbus.EventSubscriber.handleEvent(EventSubscriber.java:74)
>         at 
> com.google.common.eventbus.SynchronizedEventSubscriber.handleEvent(SynchronizedEventSubscriber.java:47)
>         at com.google.common.eventbus.EventBus.dispatch(EventBus.java:322)
>         at 
> com.google.common.eventbus.AsyncEventBus.access$001(AsyncEventBus.java:34)
>         at 
> com.google.common.eventbus.AsyncEventBus$1.run(AsyncEventBus.java:117)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to