Hi,

On our test cluster, we tried an upgrade of Cassandra from 1.22.1 to 2.0.6. It 
was not straightforward, so I would like to know whether this is expected, so 
that I can do it safely in prod.

The first time we tried, the first node we upgraded refused to start with this 
error:

ERROR [main] 2014-03-19 10:50:31,363 CassandraDaemon.java (line 488) Exception encountered during startup
java.lang.RuntimeException: Incompatible SSTable found.  Current version jb is unable to read file: /var/lib/cassandra/data/system/NodeIdInfo/system-NodeIdInfo-hf-4.  Please run upgradesstables.
        at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:415)
        at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:392)
        at org.apache.cassandra.db.Keyspace.initCf(Keyspace.java:309)
        at org.apache.cassandra.db.Keyspace.<init>(Keyspace.java:266)
        at org.apache.cassandra.db.Keyspace.open(Keyspace.java:110)
        at org.apache.cassandra.db.Keyspace.open(Keyspace.java:88)
        at org.apache.cassandra.db.SystemKeyspace.checkHealth(SystemKeyspace.java:514)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:237)
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:471)
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:560)

I have re-read NEWS.txt [1], and as far as I understand, upgradesstables is 
only required when upgrading from a version older than 1.2.9. But maybe I am 
misreading this paragraph:
    - Upgrading is ONLY supported from Cassandra 1.2.9 or later. This
      goes for sstable compatibility as well as network.  When
      upgrading from an earlier release, upgrade to 1.2.9 first and
      run upgradesstables before proceeding to 2.0.

So we did the required upgradesstables. The node started successfully.

I have checked our prod cluster: there are also some hf files, on all nodes, 
all of them matching /var/lib/cassandra/data/system/Versions/system-Versions-hf-*
I have tried many upgradesstables commands, but the files are still lying there.
# nodetool upgradesstables system Versions
Exception in thread "main" java.lang.IllegalArgumentException: Unknown table/cf pair (system.Versions)
# nodetool upgradesstables system
# nodetool upgradesstables
# nodetool upgradesstables -a system
# ls /var/lib/cassandra/data/system/Versions/*-hf-* | wc -l
15

I did not try "nodetool upgradesstables -a" since we have a lot of data.
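
To get a per-column-family inventory of the old-format files instead of a 
single count, something like the following could be run on each node. It is 
only a rough sketch: it assumes sstable file names follow the 
<keyspace>-<columnfamily>-<version>-<generation>-<component> layout seen in 
the error above, and that two-letter format versions can be compared as plain 
strings (hf before jb). The function names and the "jb" threshold are my own 
choices for illustration, not something taken from Cassandra itself.

```python
import os
import re
import sys
from collections import Counter

# Assumed file name layout, e.g. system-NodeIdInfo-hf-4-Data.db:
#   <keyspace>-<columnfamily>-<version>-<generation>-<component>
SSTABLE_RE = re.compile(
    r"^(?P<ks>[^-]+)-(?P<cf>[^-]+)-(?P<version>[a-z]{2})-(?P<gen>\d+)-(?P<component>.+)$"
)


def count_sstable_versions(data_dir):
    """Count sstable component files per (keyspace, column family, version)."""
    counts = Counter()
    for _root, _dirs, files in os.walk(data_dir):
        for name in files:
            match = SSTABLE_RE.match(name)
            if match:  # anything that does not fit the assumed layout is skipped
                key = (match.group("ks"), match.group("cf"), match.group("version"))
                counts[key] += 1
    return counts


def older_than(counts, version):
    """Keep only entries whose format version sorts before `version`."""
    return {key: n for key, n in counts.items() if key[2] < version}


if __name__ == "__main__":
    # Default path is the data directory mentioned in this mail.
    data_dir = sys.argv[1] if len(sys.argv) > 1 else "/var/lib/cassandra/data"
    stale = older_than(count_sstable_versions(data_dir), "jb")
    for (ks, cf, version), n in sorted(stale.items()):
        print(f"{ks}.{cf}: {n} file(s) in format {version}")
```

Note that it counts every component file (Data, Index, ...), the same way 
the ls | wc -l above does, so the numbers are files, not sstables.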

I guess this will cause me trouble if I try to upgrade in prod? Is there a bug 
I should report?

Continuing on our test cluster, we upgraded the second node. While we were 
running two different versions of Cassandra, there were errors in the logs:

ERROR [WRITE-/10.10.0.41] 2014-03-19 11:23:27,523 OutboundTcpConnection.java (line 234) error writing to /10.10.0.41
java.lang.RuntimeException: Cannot convert filter to old super column format. Update all nodes to Cassandra 2.0 first.
        at org.apache.cassandra.db.SuperColumns.sliceFilterToSC(SuperColumns.java:357)
        at org.apache.cassandra.db.SuperColumns.filterToSC(SuperColumns.java:258)
        at org.apache.cassandra.db.ReadCommandSerializer.serializedSize(ReadCommand.java:192)
        at org.apache.cassandra.db.ReadCommandSerializer.serializedSize(ReadCommand.java:134)
        at org.apache.cassandra.net.MessageOut.serialize(MessageOut.java:116)
        at org.apache.cassandra.net.OutboundTcpConnection.writeInternal(OutboundTcpConnection.java:251)
        at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:203)
        at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:151)

I confirm we do have old-style super columns, which were designed back when 
Cassandra was at 1.0.x. Since the replication factor in our test cluster is 1, 
I see errors on the client side anyway, because one node out of two was down. 
So I don't know for sure whether this Cassandra error affected the client; the 
time frame is too short to tell from the logs. In prod we have a replication 
factor of 3. If we do such an upgrade in prod, node by node to avoid any 
downtime, will clients still see write errors while the cluster is running 
mixed versions of Cassandra?

Nicolas

[1] https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/cassandra-2.0.6
