Re: Node after restart sees other nodes down for 10 minutes

2016-07-27 Thread Paulo Motta
> This sounds like an issue that can potentially affect many users. Is it
not the case?

This seems to affect only some configurations, especially EC2, but for some
reason not all of them (it might be related to the default TCP timeout
configuration).
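As a general illustration of that TCP timeout angle (this is not what Cassandra does internally, just a minimal sketch of the OS-level knob involved): half-open connections linger until the kernel's keepalive probes give up, and applications opt in per socket.

```python
import socket

# Illustration only: enable TCP keepalive on a socket so the kernel
# probes idle peers and eventually tears down half-open connections
# (the CLOSE_WAIT/FIN_WAIT2 pairs reported later in this thread).
# Probe timing comes from OS defaults (e.g. net.ipv4.tcp_keepalive_* on Linux).
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))  # non-zero once enabled
s.close()
```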

> Do we have a solution for this?

Watch https://issues.apache.org/jira/browse/CASSANDRA-9630 and add your
report there to confirm the issue is still present on recent versions (it
has been a while since it was first reported, and without newer reports it
didn't get much attention).

2016-07-27 19:39 GMT-03:00 Farzad Panahi :

> Paulo,
>
> I can confirm that the problem is as you stated. Some or all of the other
> nodes are keeping a connection in the CLOSE_WAIT state. Those nodes are seen
> as DN from the point of view of the node on which I restarted the Cassandra
> service. But nodetool disablegossip did not fix the problem.
>
> This sounds like an issue that can potentially affect many users. Is it
> not the case?
> Do we have a solution for this?
>
>
> 
> [netstat output snipped; the full dump is in the message below]

Re: Node after restart sees other nodes down for 10 minutes

2016-07-27 Thread Farzad Panahi
Paulo,

I can confirm that the problem is as you stated. Some or all of the other
nodes are keeping a connection in the CLOSE_WAIT state. Those nodes are seen
as DN from the point of view of the node on which I restarted the Cassandra
service. But nodetool disablegossip did not fix the problem.

This sounds like an issue that can potentially affect many users. Is it not
the case?
Do we have a solution for this?



Here is netstat and nodetool status output
1. right after stopping cassandra service on 10.4.68.222:
--
ip-10-4-54-176
tcp  0  0  10.4.54.176:51268  10.4.68.222:7000  TIME_WAIT
tcp  0  0  10.4.54.176:56135  10.4.68.222:7000  TIME_WAIT
tcp  1  0  10.4.54.176:43697  10.4.68.222:7000  CLOSE_WAIT
tcp  0  0  10.4.54.176:52372  10.4.68.222:7000  TIME_WAIT
--
--
ip-10-4-54-177
tcp  0  0  10.4.54.177:56960  10.4.68.222:7000  TIME_WAIT
tcp  0  0  10.4.54.177:54539  10.4.68.222:7000  TIME_WAIT
tcp  0  0  10.4.54.177:32823  10.4.68.222:7000  TIME_WAIT
tcp  1  0  10.4.54.177:48985  10.4.68.222:7000  CLOSE_WAIT
--
--
ip-10-4-68-222
tcp  0  0  10.4.68.222:7000  10.4.54.176:43697  FIN_WAIT2
tcp  0  0  10.4.68.222:7000  10.4.54.177:48985  FIN_WAIT2
tcp  0  0  10.4.68.222:7000  10.4.68.222:54419  TIME_WAIT
tcp  0  0  10.4.68.222:7000  10.4.43.65:43197   FIN_WAIT2
tcp  0  0  10.4.68.222:7000  10.4.68.221:44149  FIN_WAIT2
tcp  0  0  10.4.68.222:7000  10.4.68.222:41302  TIME_WAIT
tcp  0  0  10.4.68.222:7000  10.4.43.66:54321   FIN_WAIT2
--
--
ip-10-4-68-221
tcp  0  0  10.4.68.221:49599  10.4.68.222:7000  TIME_WAIT
tcp  0  0  10.4.68.221:55033  10.4.68.222:7000  TIME_WAIT
tcp  0  0  10.4.68.221:51628  10.4.68.222:7000  TIME_WAIT
tcp  1  0  10.4.68.221:44149  10.4.68.222:7000  CLOSE_WAIT
--
--
ip-10-4-43-66
tcp  0  0  10.4.43.66:55930  10.4.68.222:7000  TIME_WAIT
tcp  1  0  10.4.43.66:54321  10.4.68.222:7000  CLOSE_WAIT
tcp  0  0  10.4.43.66:60968  10.4.68.222:7000  TIME_WAIT
tcp  0  0  10.4.43.66:49087  10.4.68.222:7000  TIME_WAIT
--
--
ip-10-4-43-65
tcp  1  0  10.4.43.65:43197  10.4.68.222:7000  CLOSE_WAIT
tcp  0  0  10.4.43.65:36467  10.4.68.222:7000  TIME_WAIT
tcp  0  0  10.4.43.65:53317  10.4.68.222:7000  TIME_WAIT
tcp  0  0  10.4.43.65:54897  10.4.68.222:7000  TIME_WAIT
--
2. a bit after stopping cassandra service on 10.4.68.222:
--
ip-10-4-54-176
tcp  1  0  10.4.54.176:43697  10.4.68.222:7000  CLOSE_WAIT
--
--
ip-10-4-54-177
--
--
ip-10-4-68-222
--
--
ip-10-4-68-221
tcp  1  0  10.4.68.221:44149  10.4.68.222:7000  CLOSE_WAIT
--
--
ip-10-4-43-66
tcp  1  0  10.4.43.66:54321  10.4.68.222:7000  CLOSE_WAIT
--
--
ip-10-4-43-65
tcp  1  0  10.4.43.65:43197  10.4.68.222:7000  CLOSE_WAIT
--
3. after starting cassandra service on 10.4.68.222:
--
ip-10-4-54-176
tcp  0       0  10.4.54.176:42460  10.4.68.222:7000  ESTABLISHED
tcp  1  303403  10.4.54.176:43697  10.4.68.222:7000  CLOSE_WAIT
tcp  0       0  10.4.54.176:42109  10.4.68.222:7000  ESTABLISHED
--
--
ip-10-4-54-177
tcp  0  0  10.4.54.177:43687  10.4.68.222:7000  ESTABLISHED
tcp  0  0  10.4.54.177:56107  10.4.68.222:7000  ESTABLISHED
tcp  0  0  10.4.54.177:39426  10.4.68.222:7000  ESTABLISHED
--
--
ip-10-4-68-222
tcp  0  0  10.4.68.222:7000  0.0.0.0:*          LISTEN
tcp  0  0  10.4.68.222:7000  10.4.54.176:42109  ESTABLISHED
tcp  0  0  10.4.68.222:7000
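Dumps like the above can be summarized mechanically. A small Python sketch of the idea (the peer address is taken from this capture; in practice the input would be live `netstat -tan` output rather than the hard-coded sample):

```python
from collections import Counter

# Sample lines mirroring the capture above (whitespace-normalized).
sample = """\
tcp 0 0 10.4.43.65:36467 10.4.68.222:7000 TIME_WAIT
tcp 1 0 10.4.43.65:43197 10.4.68.222:7000 CLOSE_WAIT
tcp 0 0 10.4.43.65:53317 10.4.68.222:7000 TIME_WAIT
"""

def state_counts(netstat_text, peer):
    """Count TCP states for connections whose foreign address is `peer`."""
    # netstat -tan columns: proto, recv-q, send-q, local, foreign, state
    states = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[4] == peer:
            states[fields[5]] += 1
    return states

counts = state_counts(sample, "10.4.68.222:7000")
print(dict(counts))  # {'TIME_WAIT': 2, 'CLOSE_WAIT': 1}
```

A non-zero CLOSE_WAIT count against the restarted node's gossip port is the signature discussed in this thread.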

Re: Node after restart sees other nodes down for 10 minutes

2016-07-27 Thread Farzad Panahi
Thanks Paulo for the reply.

Cassandra version is 3.0.8. I will test what you said and share the results.

On Wed, Jul 27, 2016 at 2:01 PM, Paulo Motta 
wrote:

> This looks somewhat related to CASSANDRA-9630. What is the C* version?
>
> Can you check with netstats if other nodes keep connections with the
> stopped node in the CLOSE_WAIT state? And also if the problem disappears if
> you run nodetool disablegossip before stopping the node?
>
> 2016-07-26 16:54 GMT-03:00 Farzad Panahi :
>
>> [quoted original message and system.log excerpt snipped]

Re: Node after restart sees other nodes down for 10 minutes

2016-07-27 Thread Paulo Motta
This looks somewhat related to CASSANDRA-9630. What is the C* version?

Can you check with netstat whether other nodes keep connections to the
stopped node in the CLOSE_WAIT state? And also whether the problem disappears
if you run nodetool disablegossip before stopping the node?
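That peer-side check can be scripted. A sketch with awk (the address 10.4.68.222:7000 is the stopped node from this thread; the printf line stands in for live `netstat -tan` output so the filter can be seen working):

```shell
# Filter connections to the stopped node that are stuck in CLOSE_WAIT.
# `netstat -tan` columns: proto, recv-q, send-q, local, foreign, state.
printf 'tcp        1      0 10.4.43.65:43197 10.4.68.222:7000 CLOSE_WAIT\n' \
  | awk '$5 == "10.4.68.222:7000" && $6 == "CLOSE_WAIT"'
# On a live peer you would run:
#   netstat -tan | awk '$5 == "10.4.68.222:7000" && $6 == "CLOSE_WAIT"'
```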

2016-07-26 16:54 GMT-03:00 Farzad Panahi :

> [quoted original message and system.log excerpt snipped]

Node after restart sees other nodes down for 10 minutes

2016-07-26 Thread Farzad Panahi
I am new to Cassandra and trying to figure out how the cluster behaves when
things go south.

I have a 6-node cluster, RF=3.

I stop the Cassandra service on a node for a while. All nodes see that node
as DN. After a while I start the Cassandra service on the down node. The
interesting point is that all other nodes now see the node as UN, but the
node itself sees four nodes as DN and only one as UN. After about 10 minutes
the node sees the other nodes as up as well.

I am trying to figure out where this delay is coming from.

I have attached the part of system.log that looks interesting. It looks like
the node only starts seeing a peer as up after Gossiper logs "InetAddress
 is now UP", even though it has already handshaked with that peer
before.
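One way to quantify that delay is to pull the "now UP" timestamps out of system.log. A hedged Python sketch (the regex assumes the log format shown in the excerpt below; the sample line is taken from this thread):

```python
import re
from datetime import datetime

# A sample system.log line, format as in the excerpt below.
line = ("INFO  [GossipStage:1] 2016-07-25 21:58:47,011 Gossiper.java:1028 - "
        "Node /10.4.68.221 has restarted, now UP")

# Capture the timestamp and peer address of each "now UP" event;
# comparing these timestamps across peers shows where the delay sits.
pattern = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}).*"
    r"Node /([\d.]+) has restarted, now UP")
m = pattern.search(line)
if m:
    ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S,%f")
    print(m.group(2), ts.time())  # 10.4.68.221 21:58:47.011000
```

Running the same pattern over the whole log (one `pattern.search` per line) gives the per-peer UP times to compare against the restart time.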

Any ideas?

Cheers

Farzad

--
INFO  [main] 2016-07-25 21:58:46,044 StorageService.java:533 - Cassandra version: 3.0.8
INFO  [main] 2016-07-25 21:58:46,098 StorageService.java:534 - Thrift API version: 20.1.0
INFO  [main] 2016-07-25 21:58:46,150 StorageService.java:535 - CQL supported versions: 3.4.0 (default: 3.4.0)
INFO  [main] 2016-07-25 21:58:46,284 IndexSummaryManager.java:85 - Initializing index summary manager with a memory pool size of 198 MB and a resize interval of 60 minutes
INFO  [main] 2016-07-25 21:58:46,343 StorageService.java:554 - Loading persisted ring state
INFO  [main] 2016-07-25 21:58:46,418 StorageService.java:743 - Starting up server gossip
INFO  [main] 2016-07-25 21:58:46,680 TokenMetadata.java:429 - Updating topology for ip-10-4-43-66.ec2.internal/10.4.43.66
INFO  [main] 2016-07-25 21:58:46,707 TokenMetadata.java:429 - Updating topology for ip-10-4-43-66.ec2.internal/10.4.43.66
INFO  [main] 2016-07-25 21:58:46,792 MessagingService.java:557 - Starting Messaging Service on ip-10-4-43-66.ec2.internal/10.4.43.66:7000 (eth0)

INFO  [HANDSHAKE-/10.4.68.222] 2016-07-25 21:58:46,920 OutboundTcpConnection.java:515 - Handshaking version with /10.4.68.222
INFO  [GossipStage:1] 2016-07-25 21:58:47,011 Gossiper.java:1028 - Node /10.4.68.221 has restarted, now UP
INFO  [HANDSHAKE-/10.4.68.222] 2016-07-25 21:58:47,007 OutboundTcpConnection.java:515 - Handshaking version with /10.4.68.222
INFO  [main] 2016-07-25 21:58:47,030 StorageService.java:1902 - Node ip-10-4-43-66.ec2.internal/10.4.43.66 state jump to NORMAL
INFO  [main] 2016-07-25 21:58:47,096 CassandraDaemon.java:644 - Waiting for gossip to settle before accepting client requests...
INFO  [GossipStage:1] 2016-07-25 21:58:47,134 StorageService.java:1902 - Node /10.4.68.221 state jump to NORMAL
INFO  [HANDSHAKE-/10.4.68.221] 2016-07-25 21:58:47,137 OutboundTcpConnection.java:515 - Handshaking version with /10.4.68.221
INFO  [GossipStage:1] 2016-07-25 21:58:47,211 TokenMetadata.java:429 - Updating topology for /10.4.68.221
INFO  [GossipStage:1] 2016-07-25 21:58:47,261 TokenMetadata.java:429 - Updating topology for /10.4.68.221
INFO  [GossipStage:1] 2016-07-25 21:58:47,295 Gossiper.java:1028 - Node /10.4.68.222 has restarted, now UP
INFO  [GossipStage:1] 2016-07-25 21:58:47,337 StorageService.java:1902 - Node /10.4.68.222 state jump to NORMAL
INFO  [GossipStage:1] 2016-07-25 21:58:47,385 TokenMetadata.java:429 - Updating topology for /10.4.68.222
INFO  [GossipStage:1] 2016-07-25 21:58:47,452 TokenMetadata.java:429 - Updating topology for /10.4.68.222
INFO  [GossipStage:1] 2016-07-25 21:58:47,497 Gossiper.java:1028 - Node /10.4.54.176 has restarted, now UP
INFO  [GossipStage:1] 2016-07-25 21:58:47,544 StorageService.java:1902 - Node /10.4.54.176 state jump to NORMAL
INFO  [HANDSHAKE-/10.4.54.176] 2016-07-25 21:58:47,548 OutboundTcpConnection.java:515 - Handshaking version with /10.4.54.176
INFO  [GossipStage:1] 2016-07-25 21:58:47,594 TokenMetadata.java:429 - Updating topology for /10.4.54.176
INFO  [GossipStage:1] 2016-07-25 21:58:47,639 TokenMetadata.java:429 - Updating topology for /10.4.54.176
WARN  [GossipTasks:1] 2016-07-25 21:58:47,678 FailureDetector.java:287 - Not marking nodes down due to local pause of 43226235115 > 50
INFO  [HANDSHAKE-/10.4.43.65] 2016-07-25 21:58:47,679 OutboundTcpConnection.java:515 - Handshaking version with /10.4.43.65
INFO  [GossipStage:1] 2016-07-25 21:58:47,757 Gossiper.java:1028 - Node /10.4.54.177 has restarted, now UP
INFO  [GossipStage:1] 2016-07-25 21:58:47,788 StorageService.java:1902 - Node /10.4.54.177 state jump to NORMAL
INFO  [HANDSHAKE-/10.4.54.177] 2016-07-25 21:58:47,789 OutboundTcpConnection.java:515 - Handshaking version with /10.4.54.177
INFO  [GossipStage:1] 2016-07-25 21:58:47,836 TokenMetadata.java:429 - Updating topology for /10.4.54.177
INFO  [GossipStage:1] 2016-07-25 21:58:47,887 TokenMetadata.java:429 - Updating topology for /10.4.54.177
INFO  [GossipStage:1] 2016-07-25 21:58:47,926 Gossiper.java:1028 - Node /10.4.43.65 has restarted, now UP
INFO  [GossipStage:1] 2016-07-25 21:58:47,976 StorageService.java:1902 - Node /10.4.43.65 state jump to NORMAL
INFO  [GossipStage:1] 2016-07-25 21:58:48,036