[jira] [Updated] (CASSANDRA-11504) Slow inter-node network growth & gc issues with uptime
[ https://issues.apache.org/jira/browse/CASSANDRA-11504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Griffith updated CASSANDRA-11504:
--
    Environment: Cassandra 2.1.13 & 2.2.5  (was: Cassandra 2.1.13)

> Slow inter-node network growth & gc issues with uptime
> --
>
>                 Key: CASSANDRA-11504
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11504
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Cassandra 2.1.13 & 2.2.5
>            Reporter: Jeff Griffith
>         Attachments: InterNodeTraffic.jpg
>
>
> We are looking for help troubleshooting our production environment where we
> are experiencing GC problems. After much experimentation and troubleshooting
> with various settings, the only correlation that we can find with a slow
> growth in GC is a slow growth in network traffic BETWEEN Cassandra nodes in
> our cluster. As an example, I have attached an example where in a cluster of
> 24 nodes, I restarted 23 of them. Note that the outgoing rate for that 24th
> node remains high while all others drop after the restart. Also note that
> this graph is ONLY traffic between Cassandra nodes. Traffic from the clients
> remains FLAT throughout. Analyzing column family stats shows they are flat
> throughout. Cache hit rates are also consistent across nodes. GC is of course
> its own can of worms, so we are hoping this considerable increase in traffic
> (more than double over the course of 6 hrs) between nodes explains it. We would
> greatly appreciate any ideas as to why this extra network output correlates
> to uptime or ideas on what to "diff" between the nodes.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11504) Slow inter-node network growth & gc issues with uptime
[ https://issues.apache.org/jira/browse/CASSANDRA-11504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Griffith updated CASSANDRA-11504:
--
    Description:
We are looking for help troubleshooting our production environment where we are experiencing GC problems. After much experimentation and troubleshooting with various settings, the only correlation that we can find with a slow growth in GC is a slow growth in network traffic BETWEEN Cassandra nodes in our cluster. As an example, I have attached an example where in a cluster of 24 nodes, I restarted 23 of them. Note that the outgoing rate for that 24th node remains high while all others drop after the restart. Also note that this graph is ONLY traffic between Cassandra nodes. Traffic from the clients remains FLAT throughout. Analyzing column family stats shows they are flat throughout. Cache hit rates are also consistent across nodes. GC is of course its own can of worms, so we are hoping this considerable increase in traffic (more than double over the course of 6 hrs) between nodes explains it. We would greatly appreciate any ideas as to why this extra network output correlates to uptime or ideas on what to "diff" between the nodes.

  (was:
We are looking for help troubleshooting our production environment where we are experiencing GC problems. After much experimentation and troubleshooting with various settings, the only correlation that we can find with a slow growth in GC a slow growth in network traffic BETWEEN Cassandra nodes in our cluster. As an example, I have attached an example where in a cluster of 24 nodes, I restarted 23 of them. Note that the outgoing rate for that 24th node remains high while all others drop after the restart. Also note that this graph is ONLY traffic between Cassandra nodes. Traffic from the clients remains FLAT throughout. Analyzing column family stats shows they are flat throughout. Cache hit rates are also consistent across nodes.
GC is of course its own can of worms, so we are hoping this considerable increase in traffic (more than double over the course of 6 hrs) between nodes explains it. We would greatly appreciate any ideas as to why this extra network output correlates to uptime or ideas on what to "diff" between the nodes.)
[jira] [Updated] (CASSANDRA-11504) Slow inter-node network growth & gc issues with uptime
[ https://issues.apache.org/jira/browse/CASSANDRA-11504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Griffith updated CASSANDRA-11504:
--
    Description:
We are looking for help troubleshooting our production environment where we are experiencing GC problems. After much experimentation and troubleshooting with various settings, the only correlation that we can find with a slow growth in GC a slow growth in network traffic BETWEEN Cassandra nodes in our cluster. As an example, I have attached an example where in a cluster of 24 nodes, I restarted 23 of them. Note that the outgoing rate for that 24th node remains high while all others drop after the restart. Also note that this graph is ONLY traffic between Cassandra nodes. Traffic from the clients remains FLAT throughout. Analyzing column family stats shows they are flat throughout. Cache hit rates are also consistent across nodes. GC is of course its own can of worms, so we are hoping this considerable increase in traffic (more than double over the course of 6 hrs) between nodes explains it. We would greatly appreciate any ideas as to why this extra network output correlates to uptime or ideas on what to "diff" between the nodes.

  (was:
We are looking for help troubleshooting our production environment where we are experiencing GC problems. After much experimentation and troubleshooting with various settings, the only correlation that we can find with a slow growth in GC a slow growth in network traffic BETWEEN Cassandra nodes in our cluster. As an example, I have attached an example where in a cluster of 24 nodes, I restarted 23 of them. Note that the outgoing rate for that 24th node remains high while all others drop after the restart. Also note that this graph is ONLY traffic between Cassandra nodes. Traffic from the clients remains FLAT throughout. Analyzing column family stats shows they are flat throughout. Cache hit rates are also consistent across nodes.
GC is of course its own can of worms, so we are hoping this considerable increase in traffic (more than double over the course of 6 hrs) between nodes explains it. We would greatly appreciate any ideas as to why this extra network output correlates to uptime.)