Hi -

Can anyone help me with some Cassandra data migration issues I'm having?

I'm attempting to migrate data from a dev ring (5 nodes) to a larger production
ring (6 nodes). Both are hosted on EC2.

Cluster Info:

Small (source): Cassandra 1.2.6, replication factor 1, vnodes enabled
Larger (target): Cassandra 1.2.9, replication factor 3, vnodes enabled

After a lot of reading around, I decided to use sstableloader:

http://www.palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra

http://www.datastax.com/dev/blog/bulk-loading

http://geekswithblogs.net/johnsPerfBlog/archive/2011/07/26/how-to-use-cassandrs-sstableloader.aspx

I have a script which exports the target SSTables from each of the nodes
in the smaller ring and moves them to an instance which can connect to the
larger ring and run sstableloader. The script does the following on each
node (a rough sketch follows the list):

nodetool flush
nodetool compact
nodetool scrub
scp to node specific directory on the target
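
In case the layout matters: the scp step puts each node's files into a
<keyspace>/<column_family> directory on the loader box, since that's the
layout sstableloader expects. Roughly (keyspace, table, and paths below are
made-up placeholders, not my real names):

# run on each node of the smaller ring
nodetool flush
nodetool compact
nodetool scrub
# copy the SSTables for the column family into a per-node <keyspace>/<cf> dir on the loader box
scp -r /mnt/cassandra/data/my_ks/my_cf \
    loader-host:/mnt/migration/node1/my_ks/my_cf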

I've set up the sstableloader machine with Cassandra 1.2.9 and configured it
to bind to its static IP (listen_address, broadcast_address, rpc_address),
and added a seed node from my target ring to its seed config. The security
groups of the loader and the ring are both open to each other (all ports).
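
For reference, the relevant parts of the loader's cassandra.yaml and the
command I'm running look more or less like this (the IPs, paths and
keyspace/table names here are placeholders):

# cassandra.yaml on the sstableloader instance
listen_address: 10.x.x.50
broadcast_address: 10.x.x.50
rpc_address: 10.x.x.50
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.xxx.xxx.161"   # a node from the target ring

# then, for each exported node directory:
sstableloader -d 10.xxx.xxx.161 /mnt/migration/node1/my_ks/my_cf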

Between tests I'm deleting the contents of /mnt/cassandra/* and then
creating a fresh schema (matching the smaller ring's schema).
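
The reset between runs is more or less this (the schema file name is a
placeholder, and Cassandra is stopped while the data directories are wiped):

# on each node of the larger ring
sudo service cassandra stop
rm -rf /mnt/cassandra/*
sudo service cassandra start
# recreate the schema to match the smaller ring
cqlsh -f schema_from_small_ring.cql 10.xxx.xxx.161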

I'm getting some errors when attempting to load data from the sstableloader
instance.

On the sstableloader instance:

ERROR 22:03:30,409 Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.IOException: Connection reset by peer
at com.google.common.base.Throwables.propagate(Throwables.java:160)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225)
at sun.nio.ch.IOUtil.read(IOUtil.java:198)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:375)
at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:201)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
at java.io.InputStream.read(InputStream.java:101)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:81)
at java.io.DataInputStream.readInt(DataInputStream.java:388)
at org.apache.cassandra.net.MessageIn.read(MessageIn.java:60)
at org.apache.cassandra.streaming.FileStreamTask.receiveReply(FileStreamTask.java:197)
at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:180)
at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:91)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
... 3 more
Exception in thread "Streaming to /10.xxx.xxx.161:1"
java.lang.RuntimeException: java.io.IOException: Connection reset by peer
at com.google.common.base.Throwables.propagate(Throwables.java:160)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225)
at sun.nio.ch.IOUtil.read(IOUtil.java:198)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:375)
at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:201)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
at java.io.InputStream.read(InputStream.java:101)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:81)
at java.io.DataInputStream.readInt(DataInputStream.java:388)
at org.apache.cassandra.net.MessageIn.read(MessageIn.java:60)
at org.apache.cassandra.streaming.FileStreamTask.receiveReply(FileStreamTask.java:197)
at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:180)
at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:91)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
... 3 more

And then the progress bar gets stuck like this:

progress: [/10.xxx.xxx.199 1/1 (100)] [/10.xxx.xxx.75 1/1 (100)]
[/10.xxx.xxx.228 1/1 (100)] [/10.xxx.xxx.243 1/1 (100)] [/10.xxx.xxx.46 1/1
(100)] [/10.xxx.xxx.161 0/1 (300)] [total: 149 - 0MB/s (avg: 0MB/s)]


And on the instances in the ring:

(node which accepts the request)

DEBUG [Thrift:5] 2013-09-08 22:03:30,027 CustomTThreadPoolServer.java (line 209) Thrift transport error occurred during processing of message.
org.apache.thrift.transport.TTransportException
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:22)
at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:199)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)

(another node in the cluster)

IOException reading from socket; closing
java.io.IOException: Corrupt (negative) value length encountered
at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:352)
at org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:108)
at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:92)
at org.apache.cassandra.io.sstable.SSTableWriter.appendFromStream(SSTableWriter.java:250)
at org.apache.cassandra.streaming.IncomingStreamReader.streamIn(IncomingStreamReader.java:185)
at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:122)
at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:238)
at org.apache.cassandra.net.IncomingTcpConnection.handleStream(IncomingTcpConnection.java:178)
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:78)


I'm pretty stuck, and the whole process is fairly opaque to me. If anyone
could help me out, I'd really appreciate it!
