Thank you Bowen - yeah the only ERROR I see in
/var/log/cassandra/debug.log is:
ERROR [main] 2024-08-15 04:48:23,374 StorageService.java:2041 - Error while waiting on bootstrap to complete. Bootstrap will have to be restarted.
java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed
    at org.apache.cassandra.utils.concurrent.AbstractFuture.getWhenDone(AbstractFuture.java:239)
    at org.apache.cassandra.utils.concurrent.AbstractFuture.get(AbstractFuture.java:246)
    at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:2034)
    at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1185)
    at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1145)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:936)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:854)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:421)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:744)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:878)
Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
    at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:243)
    at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:205)
    at org.apache.cassandra.streaming.StreamSession.lambda$closeSession$2(StreamSession.java:517)
    at org.apache.cassandra.concurrent.FutureTask$1.call(FutureTask.java:96)
    at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61)
    at org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Unknown Source)
The cluster is running inside Kubernetes on bare metal, with a NetApp
for storage. I'd love a way to double the number of nodes, but it sounds
like I shouldn't have let it get this far. We're also having some odd
read performance issues that I'm diagnosing.
-Joe
On 8/14/2024 5:07 PM, Bowen Song via user wrote:
It looks like all your nodes are in the same DC and the same rack with
256 vnodes each. It's very hard (if not impossible) to add multiple
nodes to the same DC concurrently and safely in this setup. You are
better off adding one node at a time to this cluster.
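One way to stick to the one-node-at-a-time approach is to confirm that no node is still joining, leaving or moving before starting the next one. The following is only a minimal sketch, assuming the `nodetool status` output format shown later in this thread; the function name `nodes_not_normal` is hypothetical and not part of Cassandra.

```python
import re

# A node line in `nodetool status` starts with a two-letter code:
# U/D (up/down) followed by N/L/J/M (normal/leaving/joining/moving).
NODE_LINE = re.compile(r"^([UD][NLJM])\s+(\S+)")


def nodes_not_normal(status_output):
    """Return (state, address) pairs for every node that is not 'UN'."""
    pending = []
    for line in status_output.splitlines():
        match = NODE_LINE.match(line.strip())
        if match and match.group(1) != "UN":
            pending.append((match.group(1), match.group(2)))
    return pending
```

The input could come from something like `subprocess.run(["nodetool", "status"], capture_output=True, text=True).stdout`. An empty result would suggest that every listed node is up and in the normal state, so the next node can be started.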
Try searching for "ERROR" in the logs; it should tell you why the
streaming session failed. If you can find the cause of the failure, you
may be able to prevent it, or at least reduce the chance of it happening
again in the future.
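For long debug logs, a small script can pull out each ERROR entry together with its stack trace. This is a rough sketch, assuming every log entry starts with a level followed by a bracketed thread name, as in the log lines quoted in this thread; the helper name `find_entries` is made up for illustration.

```python
import re

# Matches the start of a Cassandra log entry as it appears in this thread,
# e.g. "ERROR [main] 2024-08-15 04:48:23,374 StorageService.java:2041 - ..."
ENTRY_START = re.compile(r"^(TRACE|DEBUG|INFO|WARN|ERROR)\s+\[")


def find_entries(lines, levels=("ERROR",)):
    """Return log entries whose level is in `levels`.

    Each entry is a list of lines: the header line plus any continuation
    lines (such as a stack trace) up to the next entry header.
    """
    entries = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = ENTRY_START.match(line)
        if match:
            # A new entry begins; keep it only if its level was requested.
            current = [line] if match.group(1) in levels else None
            if current is not None:
                entries.append(current)
        elif current is not None:
            # Continuation line (e.g. a stack frame) of a kept entry.
            current.append(line)
    return entries
```

Passing an open file handle for /var/log/cassandra/debug.log as `lines`, with `levels=("ERROR", "WARN")`, would also surface warnings such as the "Stream failed" line that precedes the bootstrap error.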
On 14/08/2024 21:50, Joe Obernberger wrote:
Hi all - when adding a node to our existing 15 node cluster, I get:
DEBUG [NonPeriodicTasks:1] 2024-08-14 20:34:10,383 StreamCoordinator.java:152 - Finished connecting all sessions
WARN [NonPeriodicTasks:1] 2024-08-14 20:34:10,385 StreamResultFuture.java:242 - [Stream #d7bf9f60-5a5e-11ef-aa71-51cb94e3c01f] Stream failed
DEBUG [NonPeriodicTasks:1] 2024-08-14 20:34:10,386 StreamSession.java:529 - [Stream #d7bf9f60-5a5e-11ef-aa71-51cb94e3c01f] Will close attached inbound
{3d52d53f=org.apache.cassandra.streaming.async.NettyStreamingChannel@2e6b7635,
67655215=org.apache.cassandra.streaming.async.NettyStreamingChannel@11e1a7e5,
d4d397f3=org.apache.cassandra.streaming.async.NettyStreamingChannel@11959fa4,
50e7cefb=org.apache.cassandra.streaming.async.NettyStreamingChannel@7a69edc0,
dfbbe8cd=org.apache.cassandra.streaming.async.NettyStreamingChannel@f46e666,
55d116ad=org.apache.cassandra.streaming.async.NettyStreamingChannel@1304a6e4,
5cf05913=org.apache.cassandra.streaming.async.NettyStreamingChannel@2ed1739f,
833323b4=org.apache.cassandra.streaming.async.NettyStreamingChannel@1be38c68,
1146e1c5=org.apache.cassandra.streaming.async.NettyStreamingChannel@232e3ca4,
f983d95c=org.apache.cassandra.streaming.async.NettyStreamingChannel@152b8da3,
3f061317=org.apache.cassandra.streaming.async.NettyStreamingChannel@30434912,
0c0b1395=org.apache.cassandra.streaming.async.NettyStreamingChannel@374403be,
30dafd8c=org.apache.cassandra.streaming.async.NettyStreamingChannel@65ddc2ee,
28c0fe3c=org.apache.cassandra.streaming.async.NettyStreamingChannel@2cd20d63,
55d1221c=org.apache.cassandra.streaming.async.NettyStreamingChannel@6f3ca32,
4fca9827=org.apache.cassandra.streaming.async.NettyStreamingChannel@113d70a1,
bea79e0c=org.apache.cassandra.streaming.async.NettyStreamingChannel@1afc0ada}
and outbound
{0c0b1395=org.apache.cassandra.streaming.async.NettyStreamingChannel@374403be}
channels
DEBUG [Stream-Deserializer-/192.168.189.127:7000-833323b4] 2024-08-14 20:34:10,387 StreamSession.java:677 - [Stream #d7bf9f60-5a5e-11ef-aa71-51cb94e3c01f] Socket closed after session completed with state COMPLETE
DEBUG [Stream-Deserializer-/192.168.189.127:7000-50e7cefb] 2024-08-14 20:34:10,387 StreamSession.java:677 - [Stream #d7bf9f60-5a5e-11ef-aa71-51cb94e3c01f] Socket closed after session completed with state COMPLETE
DEBUG [Stream-Deserializer-/192.168.189.127:7000-dfbbe8cd] 2024-08-14 20:34:10,387 StreamSession.java:677 - [Stream #d7bf9f60-5a5e-11ef-aa71-51cb94e3c01f] Socket closed after session completed with state COMPLETE
DEBUG [Stream-Deserializer-/192.168.189.127:7000-67655215] 2024-08-14 20:34:10,387 StreamSession.java:677 - [Stream #d7bf9f60-5a5e-11ef-aa71-51cb94e3c01f] Socket closed after session completed with state COMPLETE
DEBUG [Stream-Deserializer-/192.168.189.127:7000-55d116ad] 2024-08-14 20:34:10,387 StreamSession.java:677 - [Stream #d7bf9f60-5a5e-11ef-aa71-51cb94e3c01f] Socket closed after session completed with state COMPLETE
DEBUG [Stream-Deserializer-/192.168.189.127:7000-1146e1c5] 2024-08-14 20:34:10,387 StreamSession.java:677 - [Stream #d7bf9f60-5a5e-11ef-aa71-51cb94e3c01f] Socket closed after session completed with state COMPLETE
DEBUG [Stream-Deserializer-/192.168.189.127:7000-f983d95c] 2024-08-14 20:34:10,387 StreamSession.java:677 - [Stream #d7bf9f60-5a5e-11ef-aa71-51cb94e3c01f] Socket closed after session completed with state COMPLETE
DEBUG [Stream-Deserializer-/192.168.189.127:7000-5cf05913] 2024-08-14 20:34:10,387 StreamSession.java:677 - [Stream #d7bf9f60-5a5e-11ef-aa71-51cb94e3c01f] Socket closed after session completed with state COMPLETE
DEBUG [Stream-Deserializer-/192.168.189.127:7000-3f061317] 2024-08-14 20:34:10,387 StreamSession.java:677 - [Stream #d7bf9f60-5a5e-11ef-aa71-51cb94e3c01f] Socket closed after session completed with state COMPLETE
DEBUG [Stream-Deserializer-/192.168.189.127:7000-0c0b1395] 2024-08-14 20:34:10,387 StreamSession.java:677 - [Stream #d7bf9f60-5a5e-11ef-aa71-51cb94e3c01f] Socket closed after session completed with state COMPLETE
DEBUG [Stream-Deserializer-/192.168.189.127:7000-d4d397f3] 2024-08-14 20:34:10,387 StreamSession.java:677 - [Stream #d7bf9f60-5a5e-11ef-aa71-51cb94e3c01f] Socket closed after session completed with state COMPLETE
DEBUG [Stream-Deserializer-/192.168.189.127:7000-28c0fe3c] 2024-08-14 20:34:10,387 StreamSession.java:677 - [Stream #d7bf9f60-5a5e-11ef-aa71-51cb94e3c01f] Socket closed after session completed with state COMPLETE
DEBUG [Stream-Deserializer-/192.168.189.127:7000-3d52d53f] 2024-08-14 20:34:10,387 StreamSession.java:677 - [Stream #d7bf9f60-5a5e-11ef-aa71-51cb94e3c01f] Socket closed after session completed with state COMPLETE
DEBUG [Stream-Deserializer-/192.168.189.127:7000-55d1221c] 2024-08-14 20:34:10,387 StreamSession.java:677 - [Stream #d7bf9f60-5a5e-11ef-aa71-51cb94e3c01f] Socket closed after session completed with state COMPLETE
DEBUG [Stream-Deserializer-/192.168.189.127:7000-30dafd8c] 2024-08-14 20:34:10,387 StreamSession.java:677 - [Stream #d7bf9f60-5a5e-11ef-aa71-51cb94e3c01f] Socket closed after session completed with state COMPLETE
DEBUG [Stream-Deserializer-/192.168.189.127:7000-4fca9827] 2024-08-14 20:34:10,387 StreamSession.java:677 - [Stream #d7bf9f60-5a5e-11ef-aa71-51cb94e3c01f] Socket closed after session completed with state COMPLETE
DEBUG [Stream-Deserializer-/192.168.189.127:7000-bea79e0c] 2024-08-14 20:34:10,387 StreamSession.java:677 - [Stream #d7bf9f60-5a5e-11ef-aa71-51cb94e3c01f] Socket closed after session completed with state COMPLETE
ERROR [main] 2024-08-14 20:34:10,387 StorageService.java:2041 - Error while waiting on bootstrap to complete. Bootstrap will have to be restarted.
java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed
    at org.apache.cassandra.utils.concurrent.AbstractFuture.getWhenDone(AbstractFuture.java:239)
    at org.apache.cassandra.utils.concurrent.AbstractFuture.get(AbstractFuture.java:246)
    at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:2034)
    at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1185)
    at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1145)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:936)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:854)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:421)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:744)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:878)
Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
    at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:243)
    at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:205)
    at org.apache.cassandra.streaming.StreamSession.lambda$closeSession$2(StreamSession.java:517)
    at org.apache.cassandra.concurrent.FutureTask$1.call(FutureTask.java:96)
    at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61)
    at org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Unknown Source)
DEBUG [Messaging-EventLoop-3-15] 2024-08-14 20:34:10,388 StreamingMultiplexedChannel.java:513 - [Stream #d7bf9f60-5a5e-11ef-aa71-51cb94e3c01f] Closing stream connection channels on /192.168.189.127:7000
WARN [main] 2024-08-14 20:34:10,456 StorageService.java:1221 - Some data streaming failed. Use nodetool to check bootstrap state and resume. For more, see `nodetool help bootstrap`. IN_PROGRESS
INFO [main] 2024-08-14 20:34:10,458 Gossiper.java:2293 - Waiting for gossip to settle...
DEBUG [main] 2024-08-14 20:34:16,458 Gossiper.java:2305 - Gossip looks settled.
root@cassandra-15:/# nodetool status -r
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address                                                 Load        Tokens  Owns (effective)  Host ID                               Rack
UN  cassandra-6.cassandra.cassandra-jos.svc.cluster.local   2.07 TiB    256     20.0%             34a419fb-4db1-4fb4-8ea7-c0988b2a3d5a  rack1
UN  cassandra-3.cassandra.cassandra-jos.svc.cluster.local   2.08 TiB    256     20.0%             354a9e1f-71b4-4aba-9f03-26778aaa17e2  rack1
UN  cassandra-14.cassandra.cassandra-jos.svc.cluster.local  1.95 TiB    256     20.0%             6e748875-1101-4bc1-a132-c6fe643c6148  rack1
UN  cassandra-5.cassandra.cassandra-jos.svc.cluster.local   2.07 TiB    256     20.0%             ead07b83-a927-4904-bc6c-e8e4249f7560  rack1
UN  cassandra-7.cassandra.cassandra-jos.svc.cluster.local   2.08 TiB    256     20.0%             b5fbc09e-565e-402d-a6d0-3feac5922a13  rack1
UN  cassandra-4.cassandra.cassandra-jos.svc.cluster.local   2.27 TiB    256     20.0%             14fa3f70-a8e2-4f60-8bc9-42c7b3294718  rack1
UN  cassandra-1.cassandra.cassandra-jos.svc.cluster.local   2.07 TiB    256     20.0%             ea5979b8-71fe-4af5-b856-db06e008ced3  rack1
UN  cassandra-11.cassandra.cassandra-jos.svc.cluster.local  2.08 TiB    256     20.0%             f2bb43f7-8381-4b28-8ced-194060267c29  rack1
UN  cassandra-9.cassandra.cassandra-jos.svc.cluster.local   2.07 TiB    256     20.0%             ed1ce133-4473-4bc1-a0cc-a242d6f67acc  rack1
UN  cassandra-2.cassandra.cassandra-jos.svc.cluster.local   2.08 TiB    256     20.0%             2b5db930-2bfe-4829-b353-3b1db0ca368f  rack1
UN  cassandra-12.cassandra.cassandra-jos.svc.cluster.local  2.07 TiB    256     20.0%             1d3ed71f-18b4-4cc0-9c6e-3a5479328496  rack1
UJ  cassandra-15.cassandra.cassandra-jos.svc.cluster.local  874.82 GiB  256     ?                 9d68cd5a-6e48-4156-92c1-5c51f02bce65  rack1
UN  cassandra-0.cassandra.cassandra-jos.svc.cluster.local   2.14 TiB    256     20.0%             5c3d3b46-300d-49f2-b01d-cbdb44d98022  rack1
UN  cassandra-8.cassandra.cassandra-jos.svc.cluster.local   2.07 TiB    256     20.0%             fb2e8220-549e-4316-bb78-53f6cd07318a  rack1
UN  cassandra-10.cassandra.cassandra-jos.svc.cluster.local  2.07 TiB    256     20.0%             a2ab4d22-b564-4d6b-b743-2d78598fe53c  rack1
UN  cassandra-13.cassandra.cassandra-jos.svc.cluster.local  2.07 TiB    256     20.0%             ed865baf-5333-40e5-8c73-9de2df2bd330  rack1
If I restart the bootstrap, it usually completes, but I'd like to
double the size of the cluster, and that's a very long operation. Is
there any way to add multiple nodes at once?
Thanks!
-Joe