@Alain That was one of my teammates, very sorry for the multiple threads. *It looks like streams are failing right away when trying to rebuild?* No, streaming fails only after partially streaming some data (around 150 GB; we have around 600 GB of data on each node), with the exception stack trace below.
*It should be ran from DC3 servers, after altering keyspace to add keyspaces to the new datacenter. Is this the way you're doing it?* Yes, I'm running it from DC3 using "nodetool rebuild 'DC1'", after altering the keyspace to RF DC1:3, DC2:3, DC3:3, and we are using NetworkTopologyStrategy. Yes, all nodes are running the same c*-2.0.17 version. As I said, 'streaming_socket_timeout_in_ms: 86400000' is set to 24 hours. As suggested by @Paul and in some blogs, we are going to retry with the following changes *on the new nodes in DC3:*

*net.ipv4.tcp_keepalive_time=60*
*net.ipv4.tcp_keepalive_probes=3*
*net.ipv4.tcp_keepalive_intvl=10*

I hope these settings are enough on the new nodes from which we are going to initiate the rebuild/streaming, and are NOT required on all the existing nodes from which we are getting data streamed. Am I right? We have yet to see whether it works :( And by the way, please throw some light on this if you have faced such an exception in the past. As I mentioned in my last mail, this is the exception we are getting AFTER STREAMING some data:
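In case it helps anyone following the thread, here is a minimal sketch of how we plan to persist those sysctl settings on the new DC3 nodes (assuming a Linux box with a systemd-style /etc/sysctl.d layout; the /tmp path and file name below are only for illustration):

```shell
# Sketch: persist the TCP keepalive settings so half-dead streaming sockets
# are detected in ~90s (60s idle + 3 probes, 10s apart) instead of the
# kernel default of over 2 hours.
conf=/tmp/90-cassandra-streaming.conf   # illustrative path only
cat > "$conf" <<'EOF'
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 10
EOF
# On a real node (as root) the file would go under /etc/sysctl.d/ and be loaded:
#   cp "$conf" /etc/sysctl.d/ && sysctl -p /etc/sysctl.d/90-cassandra-streaming.conf
grep -c '^net.ipv4.tcp_keepalive' "$conf"   # prints 3
```

Note these sysctls only help connections that actually enable SO_KEEPALIVE on their sockets, so we will verify behaviour during the next rebuild attempt.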
java.io.IOException: Connection timed out
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
    at sun.nio.ch.IOUtil.write(IOUtil.java:65)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
    at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:339)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:311)
    at java.lang.Thread.run(Thread.java:745)
 INFO [STREAM-OUT-/xxx.xxx.198.191] 2016-09-27 00:28:10,347 StreamResultFuture.java (line 186) [Stream #30852870-8472-11e6-b043-3f260c696828] Session with /xxx.xxx.198.191 is complete
ERROR [STREAM-OUT-/xxx.xxx.198.191] 2016-09-27 00:28:10,347 StreamSession.java (line 461) [Stream #30852870-8472-11e6-b043-3f260c696828] Streaming error occurred
java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
    at sun.nio.ch.IOUtil.write(IOUtil.java:65)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
    at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:339)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:319)
    at java.lang.Thread.run(Thread.java:745)
ERROR [STREAM-IN-/xxx.xxx.198.191] 2016-09-27 00:28:10,461 StreamSession.java (line 461) [Stream #30852870-8472-11e6-b043-3f260c696828] Streaming error occurred
java.lang.RuntimeException: Outgoing stream handler has been closed
    at org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:126)
    at org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:524)
    at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:413)
    at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:245)
    at java.lang.Thread.run(Thread.java:745)

Thanks in advance,
techpyaasa

On Wed, Sep 28, 2016 at 6:09 PM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:
> Just saw a very similar question from Laxmikanth (laxmikanth...@gmail.com)
> on another thread, with the same logs.
>
> Would you mind avoiding splitting this across multiple threads, so we can
> gather up the information and better help you from this mailing list?
>
> C*heers,
>
> 2016-09-28 14:28 GMT+02:00 Alain RODRIGUEZ <arodr...@gmail.com>:
>
>> Hi,
>>
>> It looks like streams are failing right away when trying to rebuild.
>>
>> - Could you please share with us the command you used?
>>
>> It should be run from DC3 servers, after altering the keyspace to add
>> replicas in the new datacenter. Is this the way you're doing it?
>>
>> - Are all the nodes using the same version ('nodetool version')?
>> - What does 'nodetool status keyspace_name1' output?
>> - Are you sure to be using NetworkTopologyStrategy on 'keyspace_name1'?
>>   Have you modified this schema to add replicas on DC3?
>>
>> My guess is something could be wrong with the configuration.
>>
>>> I checked with our network operations team; they have confirmed the
>>> network is stable with no hiccups.
>>> I have set 'streaming_socket_timeout_in_ms: 86400000' (24 hours) as
>>> suggested in the DataStax blog -
>>> https://support.datastax.com/hc/en-us/articles/206502913-FAQ-How-to-reduce-the-impact-of-streaming-errors-or-failures
>>> and ran 'nodetool rebuild' one node at a time, but it was of NO USE.
>>> We are still getting the above exception.
>>
>> This looks correct to me; good you added this information, thanks.
>>
>> Another thought: I believe you need all the nodes to be up in the origin
>> DC you use for your 'nodetool rebuild <origin_dc>' command for those
>> streams to work.
>>
>> This looks a bit weird, good luck.
>>
>> C*heers,
>> -----------------------
>> Alain Rodriguez - @arodream - al...@thelastpickle.com
>> France
>>
>> The Last Pickle - Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>> 2016-09-27 18:54 GMT+02:00 techpyaasa . <techpya...@gmail.com>:
>>
>>> Hi,
>>>
>>> I'm trying to add a new data center - DC3 - to an existing c*-2.0.17
>>> cluster with 2 data centers DC1, DC2, with replication DC1:3, DC2:3,
>>> DC3:3.
>>>
>>> I'm getting the following exception repeatedly on the new nodes after
>>> I run 'nodetool rebuild':
>>>
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:00,416 GCInspector.java (line 118) GC for ParNew: 20 ms for 1 collections, 9837479688 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:03,417 GCInspector.java (line 118) GC for ParNew: 20 ms for 1 collections, 9871193904 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:06,418 GCInspector.java (line 118) GC for ParNew: 20 ms for 1 collections, 9950298136 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:09,419 GCInspector.java (line 118) GC for ParNew: 19 ms for 1 collections, 9941119568 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:12,421 GCInspector.java (line 118) GC for ParNew: 20 ms for 1 collections, 9864185024 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:15,422 GCInspector.java (line 118) GC for ParNew: 60 ms for 2 collections, 9730374352 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:18,423 GCInspector.java (line 118) GC for ParNew: 18 ms for 1 collections, 9775448168 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:21,424 GCInspector.java (line 118) GC for ParNew: 22 ms for 1 collections, 9850794272 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:24,425 GCInspector.java (line 118) GC for ParNew: 20 ms for 1 collections, 9729992448 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:27,426 GCInspector.java (line 118) GC for ParNew: 22 ms for 1 collections, 9699783920 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:30,427 GCInspector.java (line 118) GC for ParNew: 21 ms for 1 collections, 9696523920 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:33,429 GCInspector.java (line 118) GC for ParNew: 20 ms for 1 collections, 9560497904 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:36,430 GCInspector.java (line 118) GC for ParNew: 19 ms for 1 collections, 9568718352 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:39,431 GCInspector.java (line 118) GC for ParNew: 22 ms for 1 collections, 9496991384 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:42,432 GCInspector.java (line 118) GC for ParNew: 19 ms for 1 collections, 9486433840 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:45,434 GCInspector.java (line 118) GC for ParNew: 19 ms for 1 collections, 9442642688 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:48,435 GCInspector.java (line 118) GC for ParNew: 20 ms for 1 collections, 9548532008 used; max is 16760438784
>>> DEBUG [STREAM-IN-/xxx.xxx.98.168] 2016-09-27 04:24:49,756 ConnectionHandler.java (line 244) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Received File (Header (cfId: bf446a90-71c5-3552-a2e5-b1b94dbf86e3, #0, version: jb, estimated keys: 252928, transfer size: 5496759656, compressed?: true), file: /home/cassandra/data_directories/data/keyspace_name1/columnfamily_1/keyspace_name1-columnfamily_1-tmp-jb-54-Data.db)
>>> DEBUG [STREAM-OUT-/xxx.xxx.98.168] 2016-09-27 04:24:49,757 ConnectionHandler.java (line 310) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Sending Received (bf446a90-71c5-3552-a2e5-b1b94dbf86e3, #0)
>>> ERROR [STREAM-OUT-/xxx.xxx.98.168] 2016-09-27 04:24:49,759 StreamSession.java (line 461) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Streaming error occurred
>>> java.io.IOException: Connection timed out
>>>     at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>>     at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>>>     at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>>>     at sun.nio.ch.IOUtil.write(IOUtil.java:65)
>>>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
>>>     at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>>>     at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:339)
>>>     at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:311)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> DEBUG [STREAM-OUT-/xxx.xxx.98.168] 2016-09-27 04:24:49,764 ConnectionHandler.java (line 104) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Closing stream connection handler on /xxx.xxx.98.168
>>>  INFO [STREAM-OUT-/xxx.xxx.98.168] 2016-09-27 04:24:49,764 StreamResultFuture.java (line 186) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Session with /xxx.xxx.98.168 is complete
>>> ERROR [STREAM-OUT-/xxx.xxx.98.168] 2016-09-27 04:24:49,764 StreamSession.java (line 461) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Streaming error occurred
>>> java.io.IOException: Broken pipe
>>>     at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>>     at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>>>     at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>>>     at sun.nio.ch.IOUtil.write(IOUtil.java:65)
>>>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
>>>     at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>>>     at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:339)
>>>     at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:319)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> DEBUG [STREAM-IN-/xxx.xxx.98.168] 2016-09-27 04:24:49,909 ConnectionHandler.java (line 244) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Received File (Header (cfId: 68af9ee0-96f8-3b1d-a418-e5ae844f2cc2, #3, version: jb, estimated keys: 4736, transfer size: 2306880, compressed?: true), file: /home/cassandra/data_directories/data/keyspace_name1/archiving_metadata/keyspace_name1-archiving_metadata-tmp-jb-27-Data.db)
>>> ERROR [STREAM-IN-/xxx.xxx.98.168] 2016-09-27 04:24:49,909 StreamSession.java (line 461) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Streaming error occurred
>>> java.lang.RuntimeException: Outgoing stream handler has been closed
>>>     at org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:126)
>>>     at org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:524)
>>>     at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:413)
>>>     at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:245)
>>>     at java.lang.Thread.run(Thread.java:745)
>>>
>>> I checked with our network operations team; they have confirmed the
>>> network is stable with no hiccups.
>>> I have set 'streaming_socket_timeout_in_ms: 86400000' (24 hours) as
>>> suggested in the DataStax blog -
>>> https://support.datastax.com/hc/en-us/articles/206502913-FAQ-How-to-reduce-the-impact-of-streaming-errors-or-failures
>>> and ran 'nodetool rebuild' one node at a time, but it was of NO USE.
>>> We are still getting the above exception.
>>>
>>> Can someone please help me in debugging and fixing this?
>>>
>>> Thanks,
>>> techpyaasa