You can't stream between major versions. Don't tear down your first data center, upgrade it instead. On Mon, Oct 10, 2016 at 4:35 PM Abhishek Verma <ve...@uber.com> wrote:
> Hi Cassandra users, > > We are trying to upgrade our Cassandra version from 2.2.5 to 3.0.8 > (running on Mesos, but that's besides the point). We have two datacenters, > so in order to preserve our data, we are trying to upgrade one datacenter > at a time. > > Initially both DCs (dc1 and dc2) are running 2.2.5. The idea is to tear > down dc1 completely (delete all the data in it), bring it up with 3.0.8, > let data replicate from dc2 to dc1, and then tear down dc2, bring it up > with 3.0.8 and replicate data from dc1. > > I am able to reproduce the problem on bare metal clusters running on 3 > nodes. I am using Oracle's server-jre-8u74-linux-x64 JRE. > > *Node A*: Downloaded 2.2.5-bin.tar.gz, changed the seeds to include its > own IP address, changed listen_address and rpc_address to its own IP and > changed endpoint_snitch to GossipingPropertyFileSnitch. I > changed conf/cassandra-rackdc.properties to > dc=dc2 > rack=rack2 > This node started up fine and is UN in nodetool status in dc2. > > I used CQL shell to create a table and insert 3 rows: > verma@xxxxx:~/apache-cassandra-2.2.5$ bin/cqlsh $HOSTNAME > Connected to Test Cluster at xxxxx:9042. > [cqlsh 5.0.1 | Cassandra 2.2.5 | CQL spec 3.3.1 | Native protocol v4] > Use HELP for help. > cqlsh> desc tmp > > CREATE KEYSPACE tmp WITH replication = {'class': > 'NetworkTopologyStrategy', 'dc1': '1', 'dc2': '1'} AND durable_writes = > true; > > CREATE TABLE tmp.map ( > key text PRIMARY KEY, > value text > )...; > cqlsh> select * from tmp.map; > > key | value > -----+------- > k1 | v1 > k3 | v3 > k2 | v2 > > > *Node B:* Downloaded 3.0.8-bin.tar.gz, changed the seeds to include > itself and node A, changed listen_address and rpc_address to its own IP, > changed endpoint_snitch to GossipingPropertyFileSnitch. I did not change > conf/cassandra-rackdc.properties and its contents are > dc=dc1 > rack=rack1 > > In the logs, I see: > INFO [main] 2016-10-10 22:42:42,850 MessagingService.java:557 - Starting > Messaging Service on /10.164.32.29:7000 (eth0) > INFO [main] 2016-10-10 22:42:42,864 StorageService.java:784 - This node > will not auto bootstrap because it is configured to be a seed node. > > So I start a third node: > *Node C:* Downloaded 3.0.8-bin.tar.gz, changed the seeds to include node > A and node B, changed listen_address and rpc_address to its own IP, changed > endpoint_snitch to GossipingPropertyFileSnitch. I did not change > conf/cassandra-rackdc.properties. > Now, nodetool status shows: > > verma@xxxxxxx:~/apache-cassandra-3.0.8$ bin/nodetool status > Datacenter: dc1 > =============== > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns (effective) Host ID > Rack > UJ <Node C IP> 87.81 KB 256 ? > 9064832d-ed5c-4c42-ad5a-f754b52b670c rack1 > UN <Node B IP> 107.72 KB 256 100.0% > 28b1043f-115b-46a5-b6b6-8609829cde76 rack1 > Datacenter: dc2 > =============== > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns (effective) Host ID > Rack > UN <Node A IP> 73.2 KB 256 100.0% > 09cc542c-2299-45a5-a4d1-159c239ded37 rack2 > > Nodetool describe cluster shows: > verma@xxxxxxx:~/apache-cassandra-3.0.8$ bin/nodetool describecluster > Cluster Information: > Name: Test Cluster > Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch > Partitioner: org.apache.cassandra.dht.Murmur3Partitioner > Schema versions: > c2a2bb4f-7d31-3fb8-a216-00b41a643650: [<Node B IP>, <Node C IP>] > > 9770e3c5-3135-32e2-b761-65a0f6d8824e: [<Node A IP>] > > Note that there are two schema versions and they don't match. > > I see the following in the system.log: > > INFO [InternalResponseStage:1] 2016-10-10 22:48:36,055 > ColumnFamilyStore.java:390 - Initializing system_auth.roles > INFO [main] 2016-10-10 22:48:36,316 StorageService.java:1149 - JOINING: > waiting for schema information to complete > INFO [main] 2016-10-10 22:48:36,316 StorageService.java:1149 - JOINING: > schema complete, ready to bootstrap > INFO [main] 2016-10-10 22:48:36,316 StorageService.java:1149 - JOINING: > waiting for pending range calculation > INFO [main] 2016-10-10 22:48:36,317 StorageService.java:1149 - JOINING: > calculation complete, ready to bootstrap > INFO [main] 2016-10-10 22:48:36,319 StorageService.java:1149 - JOINING: > getting bootstrap token > INFO [main] 2016-10-10 22:48:36,357 StorageService.java:1149 - JOINING: > sleeping 30000 ms for pending range setup > INFO [main] 2016-10-10 22:49:06,358 StorageService.java:1149 - JOINING: > Starting to bootstrap... > INFO [main] 2016-10-10 22:49:06,494 StreamResultFuture.java:87 - [Stream > #bfb5e470-8f3b-11e6-b69a-1b451159408e] Executing streaming plan for > Bootstrap > INFO [StreamConnectionEstablisher:1] 2016-10-10 22:49:06,495 > StreamSession.java:242 - [Stream #bfb5e470-8f3b-11e6-b69a-1b451159408e] > Starting streaming to /<Node A IP> > INFO [StreamConnectionEstablisher:2] 2016-10-10 22:49:06,495 > StreamSession.java:242 - [Stream #bfb5e470-8f3b-11e6-b69a-1b451159408e] > Starting streaming to /<Node B IP> > INFO [StreamConnectionEstablisher:2] 2016-10-10 22:49:06,500 > StreamCoordinator.java:213 - [Stream #bfb5e470-8f3b-11e6-b69a-1b451159408e, > ID#0] Beginning stream session with /<Node B IP> > INFO [STREAM-IN-/<Node B IP>] 2016-10-10 22:49:06,590 > StreamResultFuture.java:183 - [Stream > #bfb5e470-8f3b-11e6-b69a-1b451159408e] Session with /<Node B IP> is complete > INFO [StreamConnectionEstablisher:1] 2016-10-10 22:49:06,635 > StreamCoordinator.java:213 - [Stream #bfb5e470-8f3b-11e6-b69a-1b451159408e, > ID#0] Beginning stream session with /<Node A IP> > ERROR [STREAM-IN-/<Node A IP>] 2016-10-10 22:49:06,639 > StreamSession.java:528 - [Stream #bfb5e470-8f3b-11e6-b69a-1b451159408e] > Streaming error occurred > java.io.IOException: Connection reset by peer > at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.8.0_102] > at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) > ~[na:1.8.0_102] > at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[na:1.8.0_102] > at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[na:1.8.0_102] > at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) > ~[na:1.8.0_102] > at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:206) > ~[na:1.8.0_102] > at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) > ~[na:1.8.0_102] > at > java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385) > ~[na:1.8.0_102] > at > org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:54) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:287) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_102] > INFO [STREAM-IN-/<Node A IP>] 2016-10-10 22:49:06,639 > StreamResultFuture.java:183 - [Stream > #bfb5e470-8f3b-11e6-b69a-1b451159408e] Session with /<Node A IP> is complete > WARN [STREAM-IN-/<Node A IP>] 2016-10-10 22:49:06,640 > StreamResultFuture.java:210 - [Stream > #bfb5e470-8f3b-11e6-b69a-1b451159408e] Stream failed > WARN [STREAM-IN-/<Node A IP>] 2016-10-10 22:49:06,640 > StorageService.java:1208 - Error during bootstrap. > org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) > [guava-18.0.jar:na] > at > com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) > [guava-18.0.jar:na] > at > com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) > [guava-18.0.jar:na] > at > com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) > [guava-18.0.jar:na] > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) > [guava-18.0.jar:na] > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:211) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:187) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:429) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.streaming.StreamSession.onError(StreamSession.java:534) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:305) > [apache-cassandra-3.0.8.jar:3.0.8] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_102] > ERROR [main] 2016-10-10 22:49:06,641 StorageService.java:1218 - Error > while waiting on bootstrap to complete. Bootstrap will have to be restarted. > java.util.concurrent.ExecutionException: > org.apache.cassandra.streaming.StreamException: Stream failed > at > com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) > ~[guava-18.0.jar:na] > at > org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1213) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:889) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:663) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:528) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:339) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:557) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:685) > [apache-cassandra-3.0.8.jar:3.0.8] > Caused by: org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) > ~[guava-18.0.jar:na] > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:211) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:187) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:429) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.streaming.StreamSession.onError(StreamSession.java:534) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:305) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_102] > WARN [main] 2016-10-10 22:49:06,646 StorageService.java:944 - Some data > streaming failed. Use nodetool to check bootstrap state and resume. For > more, see `nodetool help bootstrap`. IN_PROGRESS > INFO [main] 2016-10-10 22:49:06,647 CassandraDaemon.java:644 - Waiting > for gossip to settle before accepting client requests... > INFO [main] 2016-10-10 22:49:14,648 CassandraDaemon.java:675 - No gossip > backlog; proceeding > INFO [main] 2016-10-10 22:49:14,694 NativeTransportService.java:70 - > Netty using native Epoll event loop > INFO [main] 2016-10-10 22:49:14,726 Server.java:159 - Using Netty > Version: [netty-buffer=netty-buffer-4.0.23.Final.208198c, > netty-codec=netty-codec-4.0.23.Final.208198c, > netty-codec-http=netty-codec-http-4.0.23.Final.208198c, > netty-codec-socks=netty-codec-socks-4.0.23.Final.208198c, > netty-common=netty-common-4.0.23.Final.208198c, > netty-handler=netty-handler-4.0.23.Final.208198c, > netty-transport=netty-transport-4.0.23.Final.208198c, > netty-transport-rxtx=netty-transport-rxtx-4.0.23.Final.208198c, > netty-transport-sctp=netty-transport-sctp-4.0.23.Final.208198c, > netty-transport-udt=netty-transport-udt-4.0.23.Final.208198c] > INFO [main] 2016-10-10 22:49:14,726 Server.java:160 - Starting listening > for CQL clients on /<Node C IP>:9042 (unencrypted)... > INFO [main] 2016-10-10 22:49:14,748 CassandraDaemon.java:477 - Not > starting RPC server as requested. Use JMX > (StorageService->startRPCServer()) or nodetool (enablethrift) to start it > > I tried resuming bootstrap but it fails with the same streaming errors: > > verma@<Node C>:~/apache-cassandra-3.0.8$ bin/nodetool bootstrap resume > Resuming bootstrap > [2016-10-10 23:15:11,816] session with /<Node B IP> complete (progress: 0%) > [2016-10-10 23:15:11,939] session with /<Node A IP> complete (progress: 0%) > [2016-10-10 23:15:11,940] Stream failed > > and I see the same error in the system.log: > > StreamSession.java:528 - [Stream #64b73a20-8f3f-11e6-b69a-1b451159408e] > Streaming error occurred > java.io.IOException: Connection reset by peer > ... > > Does Cassandra support upgrading from 2.2.5 to 3.0.8 in this way? Am I > missing something? > > Thanks for your time. > -Abhishek. >