[ https://issues.apache.org/jira/browse/AURORA-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maxim Khutornenko updated AURORA-1603:
--------------------------------------
    Sprint: Twitter Aurora Q1'16 Sprint 18

> Scheduler fails to start after rollback
> ---------------------------------------
>
>                 Key: AURORA-1603
>                 URL: https://issues.apache.org/jira/browse/AURORA-1603
>             Project: Aurora
>          Issue Type: Bug
>          Components: Scheduler
>            Reporter: Maxim Khutornenko
>            Assignee: Maxim Khutornenko
>            Priority: Critical
>
> We had to roll back the scheduler because of duplicate instances in the UI.
> When we tried to restart on the older version
> (8d3fb2413306387bc533b1b800bbc97149f96b26), we got the following error, which
> prevented the scheduler from loading the snapshot:
> {noformat}
> To index multiple values under a key, use Multimaps.index.
>         at com.google.common.collect.Maps.uniqueIndex(Maps.java:1215) ~[guava-19.0.jar:na]
>         at com.google.common.collect.Maps.uniqueIndex(Maps.java:1173) ~[guava-19.0.jar:na]
>         at org.apache.aurora.scheduler.storage.db.TaskConfigManager.getConfigRow(TaskConfigManager.java:46) ~[aurora-113.jar:na]
>         at org.apache.aurora.scheduler.storage.db.TaskConfigManager.insert(TaskConfigManager.java:57) ~[aurora-113.jar:na]
>         at org.apache.aurora.scheduler.storage.db.DbJobUpdateStore.saveJobUpdate(DbJobUpdateStore.java:125) ~[aurora-113.jar:na]
>         at org.apache.aurora.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:83) ~[commons-113.jar:na]
>         at org.apache.aurora.scheduler.storage.log.SnapshotStoreImpl$7.restoreFromSnapshot(SnapshotStoreImpl.java:208) ~[aurora-113.jar:na]
>         at org.apache.aurora.scheduler.storage.log.SnapshotStoreImpl.lambda$applySnapshot$238(SnapshotStoreImpl.java:278) ~[aurora-113.jar:na]
>         at org.apache.aurora.scheduler.storage.Storage$MutateWork$NoResult.apply(Storage.java:137) ~[aurora-113.jar:na]
>         at org.apache.aurora.scheduler.storage.Storage$MutateWork$NoResult.apply(Storage.java:132) ~[aurora-113.jar:na]
>         at org.apache.aurora.scheduler.storage.db.DbStorage.transactionedWrite(DbStorage.java:146) ~[aurora-113.jar:na]
>         at org.mybatis.guice.transactional.TransactionalMethodInterceptor.invoke(TransactionalMethodInterceptor.java:101) ~[mybatis-guice-3.7.jar:3.7]
>         at org.apache.aurora.scheduler.storage.db.DbStorage.lambda$write$203(DbStorage.java:160) ~[aurora-113.jar:na]
>         at org.apache.aurora.scheduler.async.GatingDelayExecutor.closeDuring(GatingDelayExecutor.java:62) ~[aurora-113.jar:na]
>         at org.apache.aurora.scheduler.storage.db.DbStorage.write(DbStorage.java:158) ~[aurora-113.jar:na]
>         at org.apache.aurora.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:83) ~[commons-113.jar:na]
>         at org.apache.aurora.scheduler.storage.log.SnapshotStoreImpl.applySnapshot(SnapshotStoreImpl.java:274) ~[aurora-113.jar:na]
>         at org.apache.aurora.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:83) ~[commons-113.jar:na]
>         at org.apache.aurora.scheduler.storage.log.SnapshotStoreImpl.applySnapshot(SnapshotStoreImpl.java:63) ~[aurora-113.jar:na]
>         at org.apache.aurora.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:83) ~[commons-113.jar:na]
> ...
> {noformat}
> We attributed the failure to fee5943a95c4f08e148dc5f1366486a8c23d5773 and
> reverted it in https://reviews.apache.org/r/42922/. I have not yet been able
> to reproduce it in unit tests, so further investigation is needed.
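
For readers unfamiliar with the failure mode in the trace: Guava's
Maps.uniqueIndex throws an IllegalArgumentException when two values map to the
same key, which is what the message "To index multiple values under a key, use
Multimaps.index." signals. The sketch below imitates that contract with plain
JDK collections. It is an illustration only, not Aurora or Guava code: the
class UniqueIndexSketch, the helpers uniqueIndex and multiIndex, and the
"key:value" row strings are all hypothetical, and the report does not say
which keys were actually duplicated in the snapshot.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class UniqueIndexSketch {

    // Hypothetical stand-in for Guava's Maps.uniqueIndex:
    // exactly one value per key, a repeated key is rejected.
    static <K, V> Map<K, V> uniqueIndex(Iterable<V> values, Function<V, K> keyFn) {
        Map<K, V> index = new LinkedHashMap<>();
        for (V value : values) {
            K key = keyFn.apply(value);
            // putIfAbsent returns the previous value when the key is already present.
            if (index.putIfAbsent(key, value) != null) {
                throw new IllegalArgumentException("Duplicate key: " + key);
            }
        }
        return index;
    }

    // Hypothetical stand-in for Guava's Multimaps.index:
    // values sharing a key are collected into a list instead of rejected.
    static <K, V> Map<K, List<V>> multiIndex(Iterable<V> values, Function<V, K> keyFn) {
        Map<K, List<V>> index = new LinkedHashMap<>();
        for (V value : values) {
            index.computeIfAbsent(keyFn.apply(value), k -> new ArrayList<>()).add(value);
        }
        return index;
    }

    public static void main(String[] args) {
        // Made-up rows; two of them share the key "a".
        List<String> rows = Arrays.asList("a:1", "b:2", "a:3");
        Function<String, String> keyOf = row -> row.split(":")[0];

        try {
            uniqueIndex(rows, keyOf);
        } catch (IllegalArgumentException e) {
            System.out.println("uniqueIndex rejected the rows: " + e.getMessage());
        }
        System.out.println("multiIndex result: " + multiIndex(rows, keyOf));
    }
}
```

If the snapshot written by the newer scheduler contained rows that collide
under the key used in TaskConfigManager.getConfigRow, a unique index of this
kind would fail in the same way. That is a guess from the trace, not a
confirmed root cause.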



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
