[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-11-20 Thread Quentin Conner (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827750#comment-13827750
 ] 

Quentin Conner commented on CASSANDRA-6127:
---

Yes, both use case 1 and use case 2 (detailed in an earlier comment) were 
cured by patch #3.  Zero flaps were recorded across multiple trials in both use 
cases.  Patch #3 cures the flaps but does not address the CPU usage symptom.

This was tested against the cassandra-1.2 branch.  I am running the same 
test against use case 2 today, this time using the current cassandra-2.0 
branch.

 vnodes don't scale to hundreds of nodes
 ---

 Key: CASSANDRA-6127
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6127
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Any cluster that has vnodes and consists of hundreds of 
 physical nodes.
Reporter: Tupshin Harper
Assignee: Jonathan Ellis
 Attachments: 2013-11-05_18-04-03_no_compression_cpu_time.png, 
 2013-11-05_18-09-38_compression_on_cpu_time.png, 6000vnodes.patch, 
 AdjustableGossipPeriod.patch, cpu-vs-token-graph.png, 
 delayEstimatorUntilStatisticallyValid.patch, flaps-vs-tokens.png


 There are a lot of gossip-related issues related to very wide clusters that 
 also have vnodes enabled. Let's use this ticket as a master in case there are 
 sub-tickets.
 The most obvious symptom I've seen is with 1000 nodes in EC2 with m1.xlarge 
 instances. Each node configured with 32 vnodes.
 Without vnodes, cluster spins up fine and is ready to handle requests within 
 30 minutes or less. 
 With vnodes, nodes are reporting constant up/down flapping messages with no 
 external load on the cluster. After a couple of hours, they were still 
 flapping, had very high cpu load, and the cluster never looked like it was 
 going to stabilize or be useful for traffic.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-11-19 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826949#comment-13826949
 ] 

Jonathan Ellis commented on CASSANDRA-6127:
---

bq. Untested patch #3. Delays output from FailureDetector until statistically 
valid number of samples have been obtained.

Did we ever find a scenario where we can demonstrate this patch making a 
difference?  Because I think it's a good idea in theory.



[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-11-12 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820669#comment-13820669
 ] 

Jonathan Ellis commented on CASSANDRA-6127:
---

bq. ISTM that FD processing Gossip updates synchronously is a fundamental 
problem. Any hiccup in processing will cause FD false positives.

I've pulled a fix for this out to CASSANDRA-6338.



[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-11-07 Thread Quentin Conner (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816037#comment-13816037
 ] 

Quentin Conner commented on CASSANDRA-6127:
---

Good morning.  We saw the same CPU usage profile with cassandra-1.2 
8e7d7285cdeac4f2527c933280d595bbddd26935 (which included the patch to not flush 
peers CF).  

CPU time was spent in looking up EndpointState or spent in PHI calculation.  No 
surprises were found.  No race conditions, no deadlocks or mutex/monitor 
contention.

I do not know if flapping happens in 1.2 head without vnodes.  I will find out 
today, if I can get the nodes (having trouble this morning allocating from 
EC2).  Will keep trying (Fridays seem better) but could slip into the weekend...




[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-11-07 Thread Quentin Conner (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816044#comment-13816044
 ] 

Quentin Conner commented on CASSANDRA-6127:
---

Tupshin, can you further quantify the CPU usage you observed, in terms of USER 
CPU and KERNEL CPU?
Also, can you confirm the number of nodes and vnodes for those observations?

I've seen about 25% user cpu @ 256 nodes and 60% @ 512 nodes.  Kernel cpu was 
under 5% for both in my trials.



[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-11-06 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815286#comment-13815286
 ] 

Jonathan Ellis commented on CASSANDRA-6127:
---

ISTM that FD processing Gossip updates synchronously is a fundamental problem.  
Any hiccup in processing will cause FD false positives.  (And even if our own 
code is perfect, GC pauses can still do this to us.)

Wouldn't it be better if we:
- time heartbeats based on when they arrive instead of when Gossip processes 
them
- teach FD to recognize that its information is only good up to the most 
recently processed message -- the absence of messages after that doesn't mean 
everyone is down unless the Gossip stage is empty
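
The two points above can be sketched roughly as follows. This is a toy Python model, not Cassandra code: the class GossipStageSketch and its methods are hypothetical names made up for illustration, and the real change was split out to a separate ticket.

```python
from collections import deque


class GossipStageSketch:
    """Toy model of a gossip stage that stamps heartbeats on arrival."""

    def __init__(self):
        self.pending = deque()      # (endpoint, arrival_time) not yet processed
        self.last_heartbeat = {}    # endpoint -> arrival time of last processed heartbeat
        self.horizon = None         # arrival time of the most recently processed message

    def receive(self, endpoint, arrival_time):
        # Record the arrival time now, not when the stage gets around to processing.
        self.pending.append((endpoint, arrival_time))

    def process_one(self):
        endpoint, arrival_time = self.pending.popleft()
        self.last_heartbeat[endpoint] = arrival_time
        self.horizon = arrival_time

    def silence(self, endpoint, now):
        """How long the endpoint has been silent, as far as we can actually know."""
        # With a backlog, our knowledge stops at the horizon; only an empty
        # stage lets us measure silence against the current time.
        reference = self.horizon if self.pending else now
        return max(0.0, reference - self.last_heartbeat[endpoint])
```

With a heartbeat still sitting in the queue, silence() is measured against the horizon rather than the wall clock, so a slow stage (or a GC pause) does not by itself make every endpoint look dead.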



[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-11-06 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815297#comment-13815297
 ] 

Tupshin Harper commented on CASSANDRA-6127:
---

+1. Strongly agree with Jonathan's analysis and proposal.



[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-11-06 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815379#comment-13815379
 ] 

Brandon Williams commented on CASSANDRA-6127:
-

At this point, I think we should:

* see if the flapping happens with vnodes (maybe Quentin already knows from his 
last test)
* see if the flapping happens without vnodes but the same number of nodes

Because if sum() in ArrivalWindow is burning the most CPU in the Gossiper task 
(note: not bottlenecking, each call was at most ~3ms, there were just lots of 
them) then that means that the problem is no longer tied to vnodes (if it ever 
was, since sum is per-node, not per-token) and we should probably open a new 
ticket (can't start a cluster of size >=X all at once, or similar) and discuss 
there.  We know that clusters much larger than any discussed on this ticket 
exist, but I don't think any of them have all rebooted at once.



[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-11-05 Thread Quentin Conner (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813928#comment-13813928
 ] 

Quentin Conner commented on CASSANDRA-6127:
---

Good CPU profile results were obtained last night with the 1.2.9 code line.  
Switching over to the cassandra-1.2 HEAD this morning for up-to-date analysis.
The CPU profile of 1.2.9 showed that the bottleneck was the computation of the 
sum over the ArrivalWindow deque members (inter-arrival times of gossip 
messages).
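
To illustrate why re-summing the window on every call is expensive, here is a minimal Python sketch of a bounded window that keeps a running total instead. It is an assumption for illustration only: RunningSumWindow is a hypothetical name, and nothing in this ticket says the fix took this form.

```python
from collections import deque


class RunningSumWindow:
    """Bounded window of inter-arrival times with an O(1) mean."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.values = deque()
        self.total = 0.0

    def add(self, interval):
        # Evict the oldest sample first so the total always matches the deque.
        if len(self.values) == self.max_size:
            self.total -= self.values.popleft()
        self.values.append(interval)
        self.total += interval

    def mean(self):
        # No loop over the window here; the total is maintained incrementally.
        return self.total / len(self.values) if self.values else 0.0
```

The trade-off is that a floating-point running total can drift slightly over many updates, which a full re-sum avoids at the cost of a pass over the whole window per call.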



[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-11-04 Thread Matt Stump (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813134#comment-13813134
 ] 

Matt Stump commented on CASSANDRA-6127:
---

As another data point/use case: create a 32-node ring with vnodes, decommission 
one of the nodes, and observe the logs. Every node in the ring will be marked as 
down by the gossiper, then immediately be re-added as up/available.



[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-11-04 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813147#comment-13813147
 ] 

Brandon Williams commented on CASSANDRA-6127:
-

bq. Every node in the ring will be marked as down by the gossiper

In which node's view? (or all of them?)



[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-11-04 Thread Matt Stump (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813160#comment-13813160
 ] 

Matt Stump commented on CASSANDRA-6127:
---

We're observing the logs of a random sample of nodes, and on every node observed 
the entire ring is marked as down, so I assume it's for all nodes.



[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-11-04 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813184#comment-13813184
 ] 

Jonathan Ellis commented on CASSANDRA-6127:
---

How heavy is read/write load?



[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-11-04 Thread Matt Stump (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813198#comment-13813198
 ] 

Matt Stump commented on CASSANDRA-6127:
---

Zero to minimal load. 177 writes/second, 0 reads against the entire ring. 
m2.4xlarge instances.



[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-11-04 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813309#comment-13813309
 ] 

Brandon Williams commented on CASSANDRA-6127:
-

With CASSANDRA-6244 and CASSANDRA-6297 in 1.2 head, I think we need to 
re-verify this is still a problem.



[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-11-01 Thread Quentin Conner (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811823#comment-13811823
 ] 

Quentin Conner commented on CASSANDRA-6127:
---

On Monday (11/4) I will start capturing CPU profiles with a 256 or 512 node 
cluster.  The plan is to capture both with and without internode compression.
I was able to get a partial reproduction this week in a 256 node cluster -- one 
node had twice the CPU utilization of the others (20% user versus 10% user).  
But I had too much logging enabled, and that skewed the results.




[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-10-29 Thread Quentin Conner (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13808260#comment-13808260
 ] 

Quentin Conner commented on CASSANDRA-6127:
---

Brandon,

You said Patch #3 will make it take much longer for a rebooted node to know 
who's actually up or down, exacerbating CASSANDRA-4288.  I've given this some 
thought and want to see if I understand your concern.

Patch #3 serves to send a zero value for phi, for newly-discovered nodes, until 
an accurate calculation of variance is complete.  This would be 40 seconds, 
applicable to new nodes only.
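
As a rough illustration of the behavior described above (not the patch itself), the Python sketch below reports phi as zero until enough inter-arrival samples exist. The class name ArrivalWindowSketch is hypothetical, and the phi formula used here (elapsed time divided by mean interval times ln 10, an exponential-distribution approximation) is an assumption chosen to keep the example short; the 40-sample minimum and 1000-sample window are the figures quoted in this comment.

```python
import math
from collections import deque

MIN_SAMPLES = 40     # samples needed before phi is reported (figure from this comment)
WINDOW_SIZE = 1000   # size of the sample window (figure from this comment)


class ArrivalWindowSketch:
    """Tracks heartbeat inter-arrival times for one endpoint."""

    def __init__(self, window_size=WINDOW_SIZE, min_samples=MIN_SAMPLES):
        self.intervals = deque(maxlen=window_size)
        self.min_samples = min_samples
        self.last_arrival = None

    def add(self, now):
        # The first heartbeat only sets the baseline; intervals start with the second.
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def phi(self, now):
        # Too few samples: report zero rather than an unreliable estimate.
        if len(self.intervals) < self.min_samples:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        # Assumed approximation: phi = t / (mean * ln 10).
        return (now - self.last_arrival) / (mean * math.log(10))
```

With one heartbeat per second, the window reaches 40 intervals after roughly 40 seconds, which is the delay discussed here.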

However (and this is what I'm looking for you to confirm), if a new node comes 
online but is stopped again within 40 seconds of start-up, the FD will not 
convict it until the end of that 40 seconds.

I suspect this occurs less frequently than adding a node to a cluster, but 
probably depends on your use case (dev vs prod).

In my view, we can't escape the math, and the need to amass 40 samples.  That 
is why the bug exists today.  I agree we should look at tying thrift to a 
healthy startup as a compensating measure.

Instead of a fixed amount of time (gossip rounds), perhaps we should consider 
adding a hold-down timer based on a statistical measure?

This hold-down timer could be implemented for newly discovered nodes to 
suppress interaction until Gossip stabilizes.  Just like we have a high-water 
mark for phi to denote failure, we could set a low-water mark and call it a 
trust threshold.  We wouldn't enable thrift communications to the new node 
until their phi value is below this low-water mark.

So the condition for recognizing a new node for thrift purposes could be 
twofold:
  1.  valid computation for variance (40 samples obtained in the 1000 sample 
window)
  2.  accurate phi value is indeed below the low-water mark
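
The twofold condition can be written down as a small predicate. This is a sketch only: the function name is_trusted and its parameters are hypothetical, and no value for the low-water mark is proposed here, so the caller must supply one.

```python
def is_trusted(sample_count, phi, low_water_mark, min_samples=40):
    """Return True when a newly discovered node may be exposed to clients.

    sample_count   -- inter-arrival samples collected so far for the node
    phi            -- current phi value for the node
    low_water_mark -- trust threshold (hypothetical; no value is given in this comment)
    min_samples    -- samples needed for a valid variance estimate (40 per this comment)
    """
    has_valid_estimate = sample_count >= min_samples
    below_threshold = phi < low_water_mark
    return has_valid_estimate and below_threshold
```

Checking the sample count first matters: a phi reported before enough samples exist would trivially sit below any threshold, so the second condition alone would trust the node too early.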



[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-10-29 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13808328#comment-13808328
 ] 

Brandon Williams commented on CASSANDRA-6127:
-

Let's move that discussion to CASSANDRA-4288, since that change is orthogonal 
to the actual problem we have here, regardless of whether it fixes it or just 
papers over the problem.  What we need to do next on this ticket is either 
correlate a thread dump to what is burning up CPU, or attach a debugger and see 
where the time is being spent.


[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-10-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13808335#comment-13808335
 ] 

Jonathan Ellis commented on CASSANDRA-6127:
---

bq. it would be better to limit that in the config instead of failing at an 
assert later on.

Split that out to CASSANDRA-6267.


[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-10-28 Thread Chris Burroughs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13806703#comment-13806703
 ] 

Chris Burroughs commented on CASSANDRA-6127:


bq.  I'd just set a max of 1024. No one could ever need more than that. (Famous 
last words)

Isn't that equivalent to saying no one will have a heterogeneous cluster with 
more than a 1024/256 = 4x performance delta between physical nodes?  SSD vs 
spinny could account for more than that.


[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-10-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13806796#comment-13806796
 ] 

Jonathan Ellis commented on CASSANDRA-6127:
---

I'm okay with that limitation.  Intuitively it's reasonable that C* can't 
compensate for really ridiculous performance differences.

(Of course, you could also reduce num_tokens on the weak nodes below 256.)


[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-10-25 Thread Chris Burroughs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13805267#comment-13805267
 ] 

Chris Burroughs commented on CASSANDRA-6127:


bq. It would be helpful to dump the interval times for a node that is flapping 
(dumpInterArrivalTimes on the FD) so we can see how long the heartbeats are 
taking.

A per-endpoint histogram of heartbeat arrival latency seems like a worthwhile 
o.a.c.Metric to have all the time.
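
To make the idea concrete, here is a rough Python sketch of a per-endpoint histogram of heartbeat interarrival times. It is not the o.a.c metrics API; the class name HeartbeatHistogram and the bucket bounds are invented for illustration.

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical bucket upper bounds in milliseconds; the last bucket is open-ended.
BUCKET_BOUNDS_MS = [500, 1000, 2000, 5000, 10000]


class HeartbeatHistogram:
    """Counts heartbeat interarrival times per endpoint, bucketed by latency."""

    def __init__(self, bounds=BUCKET_BOUNDS_MS):
        self.bounds = list(bounds)
        # endpoint -> one counter per bucket, plus an overflow bucket
        self.counts = defaultdict(lambda: [0] * (len(self.bounds) + 1))

    def record(self, endpoint: str, interval_ms: float) -> None:
        # bisect_left finds the first bound >= interval_ms, i.e. the bucket
        # whose (inclusive) upper bound covers this interval.
        self.counts[endpoint][bisect_left(self.bounds, interval_ms)] += 1

    def snapshot(self, endpoint: str) -> dict:
        labels = [f"<={b}ms" for b in self.bounds] + [f">{self.bounds[-1]}ms"]
        return dict(zip(labels, self.counts[endpoint]))
```

A flapping endpoint would be expected to show counts in the upper buckets, while a healthy one should cluster around the gossip period.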

[~qconner]  On the topic of "wait until there is enough data before doing 
stuff", you might also be interested in the heuristic report from 
CASSANDRA-4288.


[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-10-25 Thread Quentin Conner (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13805838#comment-13805838
 ] 

Quentin Conner commented on CASSANDRA-6127:
---

I grabbed some sample log files from 10 nodes of 256 in a run today.  
[flap-intervals.tar.gz|http://qconner.s3.amazonaws.com/flap-intervals.tar.gz]

Convictions are happening with only 1 to 5 intervals recorded.  Patch #3 is 
looking like the winner but we should do the math by hand to be sure 
(volunteers?).
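
One way to do the math by hand, sketched in Python.  This assumes an exponential model, phi = t / (mean * ln 10), where t is the time since the last heartbeat; whether that matches the failure detector in the branch under test should be checked against the source.  The interval values below are made up, not taken from the attached logs.

```python
import math


def phi(time_since_last_ms: float, intervals_ms: list) -> float:
    """Phi under an assumed exponential model: t / (mean * ln 10)."""
    if not intervals_ms:
        return 0.0
    mean = sum(intervals_ms) / len(intervals_ms)
    return time_since_last_ms / (mean * math.log(10))


# Made-up example: three short intervals recorded during a burst of gossip,
# followed by a 5-second silence.
few = [200.0, 250.0, 150.0]
print(round(phi(5000.0, few), 2))

# The same silence judged against a full window at a steady 1000 ms period.
steady = [1000.0] * 40
print(round(phi(5000.0, steady), 2))
```

With only three unrepresentative samples the mean is 200 ms, so the same 5-second gap scores above a threshold of 8, while the steady window keeps it well below.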

Also, I just tested [Patch 
#3|https://issues.apache.org/jira/secure/attachment/12610117/delayEstimatorUntilStatisticallyValid.patch]
 and found 0 flaps for the same setup as yesterday (256 nodes, phi=8, normal 
1000 ms gossip period).





[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-10-25 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13805880#comment-13805880
 ] 

Jonathan Ellis commented on CASSANDRA-6127:
---

Patch 1 will break things since later on we write the length of the string as 
two bytes.

I think we're fine with 1700 vnodes per machine TBH, although it would be 
better to limit that in the config instead of failing at an assert later on.


[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-10-25 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13805890#comment-13805890
 ] 

Tupshin Harper commented on CASSANDRA-6127:
---

I'd just set a max of 1024. No one could ever need more than that. (Famous
last words)




[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-10-25 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13805928#comment-13805928
 ] 

Brandon Williams commented on CASSANDRA-6127:
-

Patch #3 will make it take much longer for a rebooted node to know who's 
actually up or down, exacerbating CASSANDRA-4288.  I'd still like to know *why* 
things are taking longer with vnodes, and I'm especially hesitant to make any 
adjustments to the gossiper or FD since we know they work fine with single 
tokens, and also because they *have no knowledge about tokens*; it's just 
another opaque state to them.  I suspect something in StorageService is 
blocking the gossiper long enough to cause this, perhaps CASSANDRA-6244 or 
something similar.


[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-10-25 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13805977#comment-13805977
 ] 

Jonathan Ellis commented on CASSANDRA-6127:
---

Couldn't we tie the thrift/native server startup to "I have enough gossip data 
now"?


[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-10-25 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13805981#comment-13805981
 ] 

Brandon Williams commented on CASSANDRA-6127:
-

That might confuse autodiscovery clients, at least without further changes.


[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-10-24 Thread Quentin Conner (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13804361#comment-13804361
 ] 

Quentin Conner commented on CASSANDRA-6127:
---

*Background and Reproduction*

The symptom is evident from the presence of "is now DOWN" messages in the 
Cassandra system.log file.  The recording of a node DOWN is often followed by a 
node UP a few seconds later.  Users have coined the term "gossip flap" for this 
phenomenon, and the occurrence of gossip flaps has both a machine and a human 
consequence.

Humans react strongly to the (temporary) marking of a node down.  Automated 
monitoring may trigger SNMP traps, etc.  A busy node that doesn't transmit 
heartbeat gossip messages on time will be marked as down though it may still 
be performing useful work.

Machine reactions include other C* nodes' buffering of row mutations and storage 
of hints on disk when another node is marked down.  I have not explored the 
machine reactions but imagine the endpointSnitch could also be affected from 
the client frame of reference.

One piece of good news is that I was able to reproduce two different use cases 
that elicit the "is now DOWN" message in Log4J log files.

Use Case #1 is as follows:
  provision 256 or 512 nodes in EC2
  install Cassandra 1.2.9
  take defaults except specify num_tokens=256 in c*.yaml
  start one node at a time

Use Case #2 is as follows:
  provision 32 nodes in EC2
  install Cassandra 1.2.9
  take defaults in c*.yaml
  configure rack
  start one node at a time
  when all nodes are up create about 1GB of data
    e.g. tools/bin/cassandra-stress -c 20 -l 3 -n 100
  provision a 33rd (extra) node in EC2
  install Cassandra 1.2.9
  take defaults except specify num_tokens=256
  start the node (auto_bootstrap=true)





[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-10-24 Thread Quentin Conner (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13804365#comment-13804365
 ] 

Quentin Conner commented on CASSANDRA-6127:
---

*Feature Suggestion*

The current Gossip failure detector is characterized by a sliding window of 
elapsed time, a heartbeat message period and a PHI threshold used to make the 
continuous random variable (lower case phi) into a dichotomous (binary) random 
variable.  That PHI (uppercase) threshold is called phi_convict_threshold.

I don't have a better mathematical theory or derivation at this writing, but I 
do have an easy workaround for your consideration.  While phi_convict_threshold 
is adjustable, the period (or frequency) of Gossip messages is not.  Adjusting 
the gossip period to integrate over a longer time baseline reduced false 
positives from the Gossip failure detector.  The side effect is an increase in 
the elapsed time to detect a legitimately-failed node.
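
The trade-off can be estimated with a small Python sketch.  It assumes an exponential model, phi = t / (mean_interval * ln 10), and that the mean heartbeat interval equals the configured gossip period; both are assumptions made for illustration, and the function name seconds_to_convict is invented.

```python
import math


def seconds_to_convict(phi_convict_threshold: float, mean_interval_s: float) -> float:
    """Silence needed before phi reaches the threshold, assuming
    phi = t / (mean_interval * ln 10) (an exponential model)."""
    return phi_convict_threshold * math.log(10) * mean_interval_s


for period_s in (1, 2, 5, 10):
    for threshold in (8, 12):
        t = seconds_to_convict(threshold, period_s)
        print(f"period={period_s}s phi={threshold}: ~{t:.1f}s of silence")
```

Under these assumptions the detection time grows linearly with the gossip period, so doubling the period doubles the time a truly dead node stays marked UP.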

Depending on user workload characteristics, and the related sources of latency 
(CPU, disk and network activity or transient delays) cited above, a System 
Architect could present a reasonable use case for controlling the Gossip 
message period.

The goal would be to set a detection window that accommodates common occurrences 
for a given deployment scenario.  Not all data centers are created equal.

Patches and results from implementation will follow in subsequent posts.

*Potential Next Steps*
  Explore concern about sensitivity to gossip period.  Do the vnode gossip 
messages exceed capacity for peers to ingest?
  Explore concern about phi estimates from un-filled (new) deque.
  Explore concern about assuming Gaussian PDF.  Networks (not computers) 
generally characterize expected arrival time by Poisson distribution, not 
Gaussian.
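
For the last item, a Python sketch comparing the two models.  Phi is taken as -log10 of the probability that a heartbeat arrives later than t; Poisson arrivals imply exponentially distributed intervals.  The function names and the sample parameters in the usage note are invented for illustration.

```python
import math


def phi_gaussian(t: float, mean: float, stddev: float) -> float:
    """Phi with normally distributed intervals: -log10(P(interval > t))."""
    # Survival function of the normal distribution via erfc.
    p_later = 0.5 * math.erfc((t - mean) / (stddev * math.sqrt(2)))
    if p_later == 0.0:
        # The tail probability underflowed; treat as certain failure.
        return math.inf
    return -math.log10(p_later)


def phi_exponential(t: float, mean: float) -> float:
    """Phi with exponentially distributed intervals (Poisson arrivals)."""
    # P(interval > t) = exp(-t / mean), so -log10 of it is t / (mean * ln 10).
    return t / (mean * math.log(10))
```

For example, with a mean of 1.0 s and an assumed standard deviation of 0.2 s, phi_gaussian(2.0, 1.0, 0.2) is far larger than phi_exponential(2.0, 1.0): the Gaussian tail falls off much faster, so the same late heartbeat looks far more suspicious under the Gaussian assumption.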



[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-10-24 Thread Quentin Conner (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13804362#comment-13804362
 ] 

Quentin Conner commented on CASSANDRA-6127:
---

*Analysis*

My first experiments aimed to quantify the length of Gossip messages and 
determine what factors drive the message length.  I found the size of certain 
gossip messages increases proportionally with the number of vnodes (num_tokens 
in c*.yaml).  I recorded message size over the num_tokens and number of nodes 
domains: (64,128,256,512,...) for tokens and (32,64,128,256,512) for nodes.  I 
also made non-rigorous observations of User and Kernel CPU (Ubuntu 10.04 LTS).  
My hunch is that both vnode count and node count have a mild effect on user CPU 
resource usage.

What is the rough estimate of bytes sent for certain Gossip messages and why 
does this matter?  The Phi Accrual Failure Detector (Hayashibara, et al) 
assumes fixed length heartbeat messages while Cassandra uses variable length 
messages.  I observed a correlation between larger messages, higher vnode 
counts and false positive detections by the Gossip FailureDetector.  These observations, 
IMHO, are not explained by the research paper.  I formed a hypothesis that the 
false positives are due to jitter in the interval values.  I wondered if 
perhaps using a longer baseline to integrate over would reduce the jitter.  

I have a second theory to follow up on.  A newly added node will not have a 
long history of Gossip heartbeat interarrival times.  At least 40 samples are 
needed to compute mean and variance with any statistical significance.  It's 
possible the phi estimation algorithm is simply invalid for newly created nodes 
and that is why we see them flap shortly after creation.
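
A Python sketch of what delaying the estimate could look like.  The class ArrivalWindow and its methods are illustrative only and are not the contents of any attached patch; the window size of 1000 and the minimum of 40 samples are the figures quoted in this thread.

```python
from collections import deque
from statistics import mean, pvariance

WINDOW_SIZE = 1000   # sliding window length mentioned in this thread
MIN_SAMPLES = 40     # minimum sample count suggested in the comment above


class ArrivalWindow:
    """Bounded history of heartbeat interarrival times for one endpoint."""

    def __init__(self, window_size=WINDOW_SIZE, min_samples=MIN_SAMPLES):
        self.intervals = deque(maxlen=window_size)
        self.min_samples = min_samples
        self.last_arrival = None

    def add(self, now: float) -> None:
        # The first heartbeat only sets the reference time; every later
        # heartbeat contributes one interarrival sample.
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def stats(self):
        """Return (mean, variance), or None while the window is too short."""
        if len(self.intervals) < self.min_samples:
            return None
        return mean(self.intervals), pvariance(self.intervals)
```

A caller would skip conviction for an endpoint while stats() returns None, which is the behaviour this second theory argues for.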

In any case, the message of interest is the GossipDigestAck2 (GDA2) because it 
is the largest of the Gossip messages.  GDA2 contains the set of 
EndpointStateMaps (node metadata) for newly-discovered nodes, i.e. those nodes 
just added to an existing cluster.  When a node becomes aware of a joining 
node, it gossips that state to three randomly-chosen other nodes.  The GDA2 
message is tailored to contain the delta of new node metadata the receiving 
node is unaware of.

For a single node, the upper limit on GDA2 message size is roughly 3 * N * k * V,
where N is the number of nodes in the cluster,
V is the number of tokens (vnodes) per node (num_tokens), and
k is a constant value, approximately 64 bytes, that represents a serialized 
token plus some other endpoint metadata.
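
A worked example of that bound in Python, treating V as num_tokens per node and k as 64 bytes.  The function name gda2_upper_bound_bytes is made up for illustration, and the result is only the rough upper limit described above, not a measured message size.

```python
def gda2_upper_bound_bytes(nodes: int, tokens_per_node: int,
                           bytes_per_token: int = 64, fanout: int = 3) -> int:
    """Rough upper bound from the text: fanout * N * k * V."""
    return fanout * nodes * bytes_per_token * tokens_per_node


for n in (32, 256, 512):
    size = gda2_upper_bound_bytes(n, 256)
    print(f"N={n}, V=256: {size / 2**20:.1f} MiB")
```

With N=256 and V=256 the bound works out to 3 * 256 * 64 * 256 = 12,582,912 bytes, or 12 MiB.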

If one is running hundreds of nodes in a cluster, the Gossip message traffic 
created when a node joins can be significant and increases with the number of 
nodes.  I believe this to be the first order effect and probably violates one 
of the assumptions of the PHI Accrual Failure Detection, that heartbeat 
messages are small enough not to consume a relevant amount of compute or 
communication resources.  The variable transmission time (due to variable 
length messages) is a clear violation of assumptions, if I've read the source 
code correctly.

On a related topic, there is a hard-coded limitation to the number of vnodes 
due to the serialization of the GDA messages.
No more than 1720 vnodes can be configured without producing a serialized vnode 
String message greater than 32K.  A patch is provided below for future use 
should this become an issue.

In clusters with hundreds of nodes, GDA2 messages can be 200 KB or 2 MB if many 
nodes join simultaneously.  This is not an issue if the computer experiences no 
latency from competing workloads.  In the real world, nodes are added because 
the cluster load has grown in terms of retained data, or in terms of a high 
transaction arrival rate.  This means node resources may be fully utilized when 
adding new nodes is typically attempted.

It occurred to me that we have another use case to accommodate.  It is common to 
experience transient failure modes, even in modern data centers with 
disciplined maintenance practices.  Ethernet cables get moved, switches and 
routers rebooted.  BGP route errors and other temporary interruptions may occur 
with the network fabric in real world scenarios.  People make mistakes, plans 
change and preventative maintenance often causes short-lived interruptions 
to network, CPU and disk subsystems.


[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-10-24 Thread Quentin Conner (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13804459#comment-13804459
 ] 

Quentin Conner commented on CASSANDRA-6127:
---

First results with workaround patch #2.
No load.  No data.  Only system keyspace and Gossip on a 256 node m1.medium 
cluster in EC2.
Nodes started in rapid succession.

*phi=8, variable gossip_period*
1154 flaps for 1 sec
685 flaps for 2 sec
146 flaps for 3 sec
88 flaps for 4 sec
70 flaps for 5 sec
100 flaps for 10 sec

*phi=12*
1289 flaps for 1 sec
77 flaps for 2 sec
6 flaps for 3 sec
1 flaps for 4 sec
3 flaps for 5 sec
1 flaps for 6 sec
0 flaps for 8 sec
1 flaps for 10 sec



[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-10-24 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13804545#comment-13804545
 ] 

Brandon Williams commented on CASSANDRA-6127:
-

It would be helpful to dump the interval times for a node that is flapping 
(dumpInterArrivalTimes on the FD) so we can see how long the heartbeats are 
taking.  If some are excessively long, we need to get thread dumps/debugger 
timings from the gossiper to see if something is blocking it or taking a long 
time before changing any fundamentals (gossip interval, FD formula) that we 
already know work in principle without vnodes.  Increasing the payload size to 
32k shouldn't cause these problems, since that is only sent during initial 
state synchronization and isn't all that large to begin with.
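
Once interval times have been dumped, one quick way to spot excessively long heartbeats is to compare each interval against the median for that node. The sketch below assumes the intervals have already been extracted into a plain list of milliseconds; the dump format itself is not described here, and the function name long_intervals and the factor of 5 are hypothetical choices for illustration.

```python
import statistics


def long_intervals(intervals_ms, factor=5.0):
    """Return (index, interval) pairs that exceed factor * median."""
    if not intervals_ms:
        return []
    cutoff = factor * statistics.median(intervals_ms)
    return [(i, v) for i, v in enumerate(intervals_ms) if v > cutoff]


if __name__ == "__main__":
    # Made-up sample values, only to show the call shape.
    sample = [1000, 1100, 900, 1000, 9000]
    print(long_intervals(sample))
```

The median is used rather than the mean so that a handful of very long gaps does not raise the cutoff and hide themselves.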


[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-10-24 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13804770#comment-13804770
 ] 

Brandon Williams commented on CASSANDRA-6127:
-

Can you see if adding -Dcassandra.unsafesystem=true allows the cluster to 
stabilize at some point?


[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-10-01 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13783372#comment-13783372
 ] 

Jonathan Ellis commented on CASSANDRA-6127:
---

bq. After a couple of hours, they were still flapping, had very high cpu load

To clarify, this is a bit of a mashup of multiple observations:

bq. When there was zero traffic on the cluster, we were seeing flapping without 
very high cpu. On smaller tests, we saw much higher cpu than expected when 
under load.


[jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-10-01 Thread Darla Baker (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13783481#comment-13783481
 ] 

Darla Baker commented on CASSANDRA-6127:


Per Jonathan's request, I'm adding an update here regarding eBay's experience 
on https://support.datastax.com/tickets/6928 which was the result of the first 
stage of executing the plan from https://support.datastax.com/requests/6636.

He had an existing 32 node DSE 3.1.0 cluster in their PHX data center.  
Their plan was to add a second data center to the cluster in SLC with 50 nodes 
and vnodes enabled.  They were to begin with bringing all nodes up with auto 
bootstrapping turned off to prevent any data streaming until they were ready to 
make other changes to bring the data center fully online.

Essentially immediately upon bringing the nodes up in SLC, the nodes in PHX 
began reporting as down and he began receiving SMS messages and calls from 
application engineers that the application which uses that cluster was down.

As we were in triage mode, the most expedient course of action was to shut down 
the SLC nodes and remove them from gossip.  Upon trying to execute the nodetool 
removenode command we hit CASSANDRA-5857 although we thought up to this point 
that nodetool decommission was responsible for the issue.  In any case, we 
started the process of executing the workaround as per that ticket.  At the 
point we parted, the process was going slowly but he reported it was working 
and the nodes were disappearing from the ring and the application engineers 
were reporting that the application was back online.

At some point during the weekend, Alex reached out to Jeremy, who was on call, 
and Jeremy was finally able to get the nodes removed from gossip, fully 
stabilize the 32 node PHX data center and fully decommission the SLC data 
center.

Alex attached some logs to the ticket during the event.  We were seeing node 
flapping and NPEs during the event.

Ticket https://support.datastax.com/tickets/6917 contains some additional 
details on the test cases.

Ticket https://support.datastax.com/tickets/6939 contains the alternate plan 
that eBay is considering in light of the difficulties encountered with bringing 
SLC online.
