[ https://issues.apache.org/jira/browse/CASSANDRA-7188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reassigned CASSANDRA-7188:
-----------------------------------------

    Assignee: Michael Shuler  (was: Albert P Tobey)

> Wrong class type: class org.apache.cassandra.db.Column in CounterColumn.reconcile
> ---------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7188
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7188
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Nicolas Lalevée
>            Assignee: Michael Shuler
>
> When migrating a cluster of 6 nodes from 1.2.11 to 2.0.7, we started to see
> this error on the first migrated node:
> {noformat}
> ERROR [ReplicateOnWriteStage:1] 2014-05-07 11:26:59,779 CassandraDaemon.java (line 198) Exception in thread Thread[ReplicateOnWriteStage:1,5,main]
> java.lang.AssertionError: Wrong class type: class org.apache.cassandra.db.Column
>     at org.apache.cassandra.db.CounterColumn.reconcile(CounterColumn.java:159)
>     at org.apache.cassandra.db.filter.QueryFilter$1.reduce(QueryFilter.java:109)
>     at org.apache.cassandra.db.filter.QueryFilter$1.reduce(QueryFilter.java:103)
>     at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:112)
>     at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98)
>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>     at org.apache.cassandra.db.filter.NamesQueryFilter.collectReducedColumns(NamesQueryFilter.java:98)
>     at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
>     at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
>     at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
>     at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
>     at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
>     at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1540)
>     at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1369)
>     at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:327)
>     at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:55)
>     at org.apache.cassandra.db.CounterMutation.makeReplicationMutation(CounterMutation.java:100)
>     at org.apache.cassandra.service.StorageProxy$8$1.runMayThrow(StorageProxy.java:1085)
>     at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1916)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> {noformat}
> We then saw this error on the other 5 nodes, still on 1.2.x:
> {noformat}
> ERROR [MutationStage:2793] 2014-05-07 11:46:12,301 CassandraDaemon.java (line 191) Exception in thread Thread[MutationStage:2793,5,main]
> java.lang.AssertionError: Wrong class type: class org.apache.cassandra.db.Column
>     at org.apache.cassandra.db.CounterColumn.reconcile(CounterColumn.java:165)
>     at org.apache.cassandra.db.AtomicSortedColumns$Holder.addColumn(AtomicSortedColumns.java:378)
>     at org.apache.cassandra.db.AtomicSortedColumns.addColumn(AtomicSortedColumns.java:166)
>     at org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:119)
>     at org.apache.cassandra.db.SuperColumn.addColumn(SuperColumn.java:218)
>     at org.apache.cassandra.db.SuperColumn.putColumn(SuperColumn.java:229)
>     at org.apache.cassandra.db.ThreadSafeSortedColumns.addColumnInternal(ThreadSafeSortedColumns.java:108)
>     at org.apache.cassandra.db.ThreadSafeSortedColumns.addAllWithSizeDelta(ThreadSafeSortedColumns.java:138)
>     at org.apache.cassandra.db.AbstractColumnContainer.addAllWithSizeDelta(AbstractColumnContainer.java:99)
>     at org.apache.cassandra.db.Memtable.resolve(Memtable.java:205)
>     at org.apache.cassandra.db.Memtable.put(Memtable.java:168)
>     at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:742)
>     at org.apache.cassandra.db.Table.apply(Table.java:388)
>     at org.apache.cassandra.db.Table.apply(Table.java:353)
>     at org.apache.cassandra.db.RowMutation.apply(RowMutation.java:280)
>     at org.apache.cassandra.db.CounterMutation.apply(CounterMutation.java:137)
>     at org.apache.cassandra.service.StorageProxy$7.runMayThrow(StorageProxy.java:773)
>     at org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:1651)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:679)
> {noformat}
> Here is another stack trace we also saw on the 5 unmigrated nodes:
> {noformat}
> ERROR [ReadStage:4242] 2014-05-07 11:46:12,259 CassandraDaemon.java (line 191) Exception in thread Thread[ReadStage:4242,5,main]
> java.lang.AssertionError: Wrong class type: class org.apache.cassandra.db.Column
>     at org.apache.cassandra.db.CounterColumn.reconcile(CounterColumn.java:165)
>     at org.apache.cassandra.db.AtomicSortedColumns$Holder.addColumn(AtomicSortedColumns.java:378)
>     at org.apache.cassandra.db.AtomicSortedColumns.addColumn(AtomicSortedColumns.java:166)
>     at org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:119)
>     at org.apache.cassandra.db.SuperColumn.addColumn(SuperColumn.java:218)
>     at org.apache.cassandra.db.SuperColumn.putColumn(SuperColumn.java:229)
>     at org.apache.cassandra.db.ArrayBackedSortedColumns.resolveAgainst(ArrayBackedSortedColumns.java:164)
>     at org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(ArrayBackedSortedColumns.java:141)
>     at org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:119)
>     at org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:114)
>     at org.apache.cassandra.db.filter.QueryFilter$1.reduce(QueryFilter.java:112)
>     at org.apache.cassandra.db.filter.QueryFilter$1.reduce(QueryFilter.java:96)
>     at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:111)
>     at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97)
>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>     at org.apache.cassandra.db.filter.NamesQueryFilter.collectReducedColumns(NamesQueryFilter.java:103)
>     at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
>     at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
>     at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:291)
>     at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
>     at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1391)
>     at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1207)
>     at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1123)
>     at org.apache.cassandra.db.Table.getRow(Table.java:347)
>     at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
>     at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
>     at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:679)
> {noformat}
> On the client side, requests were failing with:
> {noformat}
> Caused by: org.apache.cassandra.thrift.UnavailableException: null
>     at org.apache.cassandra.thrift.Cassandra$get_slice_result.read(Cassandra.java:7866)
>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>     at org.apache.cassandra.thrift.Cassandra$Client.recv_get_slice(Cassandra.java:594)
>     at org.apache.cassandra.thrift.Cassandra$Client.get_slice(Cassandra.java:578)
>     at me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:274)
> {noformat}
> After seeing these errors, we shut down the first migrated node, hoping
> that would stop the client errors. But errors continued to be logged,
> even though only the 5 nodes on 1.2.x remained in the ring.
> As the usual wild guess, we restarted a node to try to fix it. To our
> surprise, it failed on startup with:
> {noformat}
>  INFO 11:33:40,190 Initializing system.LocationInfo
> java.lang.AssertionError
>     at org.apache.cassandra.cql3.CFDefinition.<init>(CFDefinition.java:162)
>     at org.apache.cassandra.config.CFMetaData.updateCfDef(CFMetaData.java:1541)
>     at org.apache.cassandra.config.CFMetaData.fromSchema(CFMetaData.java:1456)
>     at org.apache.cassandra.config.KSMetaData.deserializeColumnFamilies(KSMetaData.java:306)
>     at org.apache.cassandra.config.KSMetaData.fromSchema(KSMetaData.java:287)
>     at org.apache.cassandra.db.DefsTable.loadFromTable(DefsTable.java:154)
>     at org.apache.cassandra.config.DatabaseDescriptor.loadSchemas(DatabaseDescriptor.java:574)
>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:253)
>     at org.apache.cassandra.service.CassandraDaemon.init(CassandraDaemon.java:381)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:616)
>     at org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:212)
> Cannot load daemon
> Service exit with a return value of 3
> {noformat}
> From there we had only 4 running nodes, with errors spreading around. So we
> halted everything, put the first node back on 1.2.11, and restored the data
> from the snapshot taken just before the first node was migrated.

--
This message was sent by Atlassian JIRA
(v6.2#6252)