[ https://issues.apache.org/jira/browse/CASSANDRA-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295852#comment-14295852 ]
Yuki Morishita commented on CASSANDRA-8696:
-------------------------------------------

If there are no errors on the replica nodes, the snapshot request may have timed out. Check whether the replica nodes are busy (e.g. heavy GC activity) and check nodetool tpstats.

The lost notification is harmless: it only tells you that nodetool lost some messages, so you should check the log for repair completion.

> nodetool repair on cassandra 2.1.2 keyspaces return java.lang.RuntimeException: Could not create snapshot
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8696
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8696
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jeff Liu
>
> When trying to run nodetool repair -pr on a Cassandra node (2.1.2), Cassandra throws Java exceptions: cannot create snapshot.
> The error log from system.log:
> {noformat}
> INFO [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:28,815 StreamResultFuture.java:166 - [Stream #692c1450-a692-11e4-9973-070e938df227 ID#0] Prepare completed. Receiving 2 files(221187 bytes), sending 5 files(632105 bytes)
> INFO [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:29,046 StreamResultFuture.java:180 - [Stream #692c1450-a692-11e4-9973-070e938df227] Session with /10.97.9.110 is complete
> INFO [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:29,046 StreamResultFuture.java:212 - [Stream #692c1450-a692-11e4-9973-070e938df227] All sessions completed
> INFO [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:29,047 StreamingRepairTask.java:96 - [repair #685e3d00-a692-11e4-9973-070e938df227] streaming task succeed, returning response to /10.98.194.68
> INFO [RepairJobTask:1] 2015-01-28 02:07:29,065 StreamResultFuture.java:86 - [Stream #692c6270-a692-11e4-9973-070e938df227] Executing streaming plan for Repair
> INFO [StreamConnectionEstablisher:4] 2015-01-28 02:07:29,065 StreamSession.java:213 - [Stream #692c6270-a692-11e4-9973-070e938df227] Starting streaming to /10.66.187.201
> INFO [StreamConnectionEstablisher:4] 2015-01-28 02:07:29,070 StreamCoordinator.java:209 - [Stream #692c6270-a692-11e4-9973-070e938df227, ID#0] Beginning stream session with /10.66.187.201
> INFO [STREAM-IN-/10.66.187.201] 2015-01-28 02:07:29,465 StreamResultFuture.java:166 - [Stream #692c6270-a692-11e4-9973-070e938df227 ID#0] Prepare completed. Receiving 5 files(627994 bytes), sending 5 files(632105 bytes)
> INFO [StreamReceiveTask:22] 2015-01-28 02:07:31,971 StreamResultFuture.java:180 - [Stream #692c6270-a692-11e4-9973-070e938df227] Session with /10.66.187.201 is complete
> INFO [StreamReceiveTask:22] 2015-01-28 02:07:31,972 StreamResultFuture.java:212 - [Stream #692c6270-a692-11e4-9973-070e938df227] All sessions completed
> INFO [StreamReceiveTask:22] 2015-01-28 02:07:31,972 StreamingRepairTask.java:96 - [repair #685e3d00-a692-11e4-9973-070e938df227] streaming task succeed, returning response to /10.98.194.68
> ERROR [RepairJobTask:1] 2015-01-28 02:07:39,444 RepairJob.java:127 - Error occurred during snapshot phase
> java.lang.RuntimeException: Could not create snapshot at /10.97.9.110
>     at org.apache.cassandra.repair.SnapshotTask$SnapshotCallback.onFailure(SnapshotTask.java:77) ~[apache-cassandra-2.1.2.jar:2.1.2]
>     at org.apache.cassandra.net.MessagingService$5$1.run(MessagingService.java:347) ~[apache-cassandra-2.1.2.jar:2.1.2]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_45]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_45]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
>     at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> INFO [AntiEntropySessions:6] 2015-01-28 02:07:39,445 RepairSession.java:260 - [repair #6f85e740-a692-11e4-9973-070e938df227] new session: will sync /10.98.194.68, /10.66.187.201, /10.226.218.135 on range (128171798046680518737469720690862638799,128635403083592540777731520865977436165] for events.[bigint0text, bigint0boolean, bigint0int, dataset_catalog, column_categories, bigint0double, bigint0bigint]
> ERROR [AntiEntropySessions:5] 2015-01-28 02:07:39,445 RepairSession.java:303 - [repair #685e3d00-a692-11e4-9973-070e938df227] session completed with the following error
> java.io.IOException: Failed during snapshot creation.
>     at org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344) ~[apache-cassandra-2.1.2.jar:2.1.2]
>     at org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:128) ~[apache-cassandra-2.1.2.jar:2.1.2]
>     at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) ~[guava-16.0.jar:na]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
>     at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> ERROR [AntiEntropySessions:5] 2015-01-28 02:07:39,446 CassandraDaemon.java:153 - Exception in thread Thread[AntiEntropySessions:5,5,RMI Runtime]
> java.lang.RuntimeException: java.io.IOException: Failed during snapshot creation.
>     at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.jar:na]
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) ~[apache-cassandra-2.1.2.jar:2.1.2]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_45]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_45]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_45]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
>     at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> Caused by: java.io.IOException: Failed during snapshot creation.
>     at org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344) ~[apache-cassandra-2.1.2.jar:2.1.2]
>     at org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:128) ~[apache-cassandra-2.1.2.jar:2.1.2]
>     at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) ~[guava-16.0.jar:na]
>     ... 3 common frames omitted
> {noformat}
> The only change we made recently was to change the keyspace replication factor from 2 to 3 before seeing those errors. At the same time, we also started seeing timeout errors from the application.
> The timeout error is something like:
> {noformat}
> core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ONE (1 responses were required but only 0 replica responded)
>     at com.datastax.driver.core.exceptions.ReadTimeoutException.copy(ReadTimeoutException.java:69) ~[com.datastax.cassandra.cassandra-driver-core-2.1.3.jar:na]
>     at com.datastax.driver.core.Responses$Error.asException(Responses.java:100) ~[com.datastax.cassandra.cassandra-driver-core-2.1.3.jar:na]
>     at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:110) ~[com.datastax.cassandra.cassandra-driver-core-2.1.3.jar:na]
>     at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:249) ~[com.datastax.cassandra.cassandra-driver-core-2.1.3.jar:na]
>     at com.datastax.driver.core.RequestHandler.onSet(RequestHandler.java:433) ~[com.datastax.cassandra.cassandra-driver-core-2.1.3.jar:na]
>     at com.datastax.driver.core.Connection$Dispatcher.messageReceived(Connection.java:668) ~[com.datastax.cassandra.cassandra-driver-core-2.1.3.jar:na]
>     at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:70) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) ~[io.netty.netty-3.9.0.Final.jar:na]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_55]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_55]
>     at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_55]
> Caused by: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ONE (1 responses were required but only 0 replica responded)
>     at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:61) ~[com.datastax.cassandra.cassandra-driver-core-2.1.3.jar:na]
>     at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:38) ~[com.datastax.cassandra.cassandra-driver-core-2.1.3.jar:na]
>     at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:168) ~[com.datastax.cassandra.cassandra-driver-core-2.1.3.jar:na]
>     at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:66) ~[io.netty.netty-3.9.0.Final.jar:na]
>     ... 21 common frames omitted
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
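
Since the comment suggests checking the log for repair completion rather than relying on nodetool's notifications, here is a minimal Python sketch of that check. It assumes system.log lines carry the "[repair #<id>] <message>" fragment seen in the quoted log. The function name and status labels are hypothetical, and the success phrase is an assumption that does not appear in the quoted log, so verify it against your own system.log.

```python
import re
from collections import OrderedDict

# Matches the "[repair #<uuid>] <message>" fragment seen in the quoted system.log.
REPAIR_RE = re.compile(r"\[repair #([0-9a-f-]{36})\]\s+(.*)")

# Failure phrase copied from the quoted log (RepairSession.java:303).
FAILURE_PHRASE = "session completed with the following error"
# ASSUMPTION: success phrase, not shown in the quoted log; adjust if yours differs.
SUCCESS_PHRASE = "session completed successfully"


def summarize_repair_sessions(lines):
    """Map each repair session id to 'failed', 'succeeded', 'started', or 'seen'."""
    status = OrderedDict()
    for line in lines:
        match = REPAIR_RE.search(line)
        if not match:
            continue
        session_id, message = match.groups()
        if message.startswith(FAILURE_PHRASE):
            status[session_id] = "failed"
        elif message.startswith(SUCCESS_PHRASE):
            status[session_id] = "succeeded"
        elif message.startswith("new session"):
            status.setdefault(session_id, "started")
        else:
            # Any other repair message: record the id without overriding a final state.
            status.setdefault(session_id, "seen")
    return status


# Three lines taken from the quoted log (the "new session" line is shortened).
sample = [
    "INFO [StreamReceiveTask:22] 2015-01-28 02:07:31,972 StreamingRepairTask.java:96 - "
    "[repair #685e3d00-a692-11e4-9973-070e938df227] streaming task succeed, "
    "returning response to /10.98.194.68",
    "INFO [AntiEntropySessions:6] 2015-01-28 02:07:39,445 RepairSession.java:260 - "
    "[repair #6f85e740-a692-11e4-9973-070e938df227] new session: will sync /10.98.194.68",
    "ERROR [AntiEntropySessions:5] 2015-01-28 02:07:39,445 RepairSession.java:303 - "
    "[repair #685e3d00-a692-11e4-9973-070e938df227] session completed with the following error",
]

for session_id, state in summarize_repair_sessions(sample).items():
    print(session_id, state)
```

To run it against a real log, pass an open file handle instead of the sample list, for example `summarize_repair_sessions(open(path))` with `path` pointing at your system.log. Sessions that remain "started" or "seen" have no final status in the scanned portion of the log.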