[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6
[ https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592119#comment-14592119 ]

Study Hsueh edited comment on CASSANDRA-9607 at 7/1/15 2:23 AM:

2015-06-15 13:40:41,200 upgrade to 2.1.6
2015-06-17 18:32:40,740 whole cluster went down

was (Author: study):

I had uploaded heap dump when OOM occurred: http://54.199.247.66/java_1434380208.hprof
2015-06-15 13:40:41,200 upgrade to 2.1.6
2015-06-17 18:32:40,740 whole cluster went down

Get high load after upgrading from 2.1.3 to cassandra 2.1.6
---
Key: CASSANDRA-9607
URL: https://issues.apache.org/jira/browse/CASSANDRA-9607
Project: Cassandra
Issue Type: Bug
Environment:
  OS: CentOS 6 * 4
      Ubuntu 14.04 * 2
  JDK: Oracle JDK 7, Oracle JDK 8
  VM: Azure VM Standard A3 * 6
  RAM: 7 GB
  Cores: 4
Reporter: Study Hsueh
Assignee: Tyler Hobbs
Priority: Critical
Fix For: 2.1.x, 2.2.x
Attachments: GC_state.png, cassandra.yaml, client_blocked_thread.png, cpu_profile.png, dump.tdump, load.png, log.zip, schema.zip, vm_monitor.png

After upgrading cassandra version from 2.1.3 to 2.1.6, the average load of my cassandra cluster grows from 0.x~1.x to 3.x~6.x.
What kind of additional information should I provide for this problem?

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6
[ https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600666#comment-14600666 ]

Study Hsueh edited comment on CASSANDRA-9607 at 6/25/15 4:37 AM:

My colleague has repeated the query in 2.1.3 again, and the cluster went down again. So the root cause should be the query.

was (Author: study):

My colleague have repeated the query in 2.1.3 again, and the cluster went down again. So the root cause should be the query.
[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6
[ https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596605#comment-14596605 ]

Robbie Strickland edited comment on CASSANDRA-9607 at 6/22/15 8:43 PM:

I was able to bring the data down to my local machine and replicate the issue on a fresh 2.1.5 install while profiling. I've attached screen shots of the session, as well as the thread state on the client while it's happening. You can see the server blocking on select and the client blocking on accept, which of course causes both ends to become unresponsive.

was (Author: rstrickland):

I was able to bring the data down to my local machine and replicate the issue on a fresh install while profiling. I've attached screen shots of the session, as well as the thread state on the client while it's happening. You can see the server blocking on select and the client blocking on accept, which of course causes both ends to become unresponsive.
[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6
[ https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595975#comment-14595975 ]

Robbie Strickland edited comment on CASSANDRA-9607 at 6/22/15 2:25 PM:

I did some further testing on this, to at least isolate the affected versions and nail down steps to reproduce. I started with a test cluster, 3 x i2.4xlarge instances running 2.1.4 and Spark 1.2.2. I loaded a few sstables (using sstableloader) from one of my problematic production tables, and each test involved the following simple statement in the Spark shell:

{code}
sc.cassandraTable("mykeyspace", "mytable").take(100).foreach(println)
{code}

I then upgraded to 2.1.6, loaded about 100G of data, and downgraded back to 2.1.4. The net result is that 2.1.4 works with both the small and larger data sets, while 2.1.5 and 2.1.6 work only with the smaller data sets (the 2.1.5 result comes from my production cluster). The larger set caused the hang in both cases.

was (Author: rstrickland):

I did some further testing on this, to at least isolate the affected versions and nail down steps to reproduce. I started with a test cluster, 3 x i2.4xlarge instances running 2.1.4 and Spark 1.2.2. I loaded a few sstables (using sstableloader) from one of my problematic production tables, and each test involved the following simple statement in the Spark shell:

{code}
sc.cassandraTable("prod_analytics_events", "profileevents").take(100).foreach(println)
{code}

I then upgraded to 2.1.6, loaded about 100G of data, and downgraded back to 2.1.4. The net result is that 2.1.4 works with both the small and larger data sets, while 2.1.5 and 2.1.6 work only with the smaller data sets (the 2.1.5 result comes from my production cluster). The larger set caused the hang in both cases.

Get high load after upgrading from 2.1.3 to cassandra 2.1.6
---
Key: CASSANDRA-9607
URL: https://issues.apache.org/jira/browse/CASSANDRA-9607
Project: Cassandra
Issue Type: Bug
Environment:
  OS: CentOS 6 * 4
      Ubuntu 14.04 * 2
  JDK: Oracle JDK 7, Oracle JDK 8
  VM: Azure VM Standard A3 * 6
  RAM: 7 GB
  Cores: 4
Reporter: Study Hsueh
Assignee: Tyler Hobbs
Priority: Critical
Fix For: 2.1.x, 2.2.0 rc2
Attachments: cassandra.yaml, load.png, log.zip, schema.zip

After upgrading cassandra version from 2.1.3 to 2.1.6, the average load of my cassandra cluster grows from 0.x~1.x to 3.x~6.x.
What kind of additional information should I provide for this problem?

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6
[ https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595975#comment-14595975 ]

Robbie Strickland edited comment on CASSANDRA-9607 at 6/22/15 2:24 PM:

I did some further testing on this, to at least isolate the affected versions and nail down steps to reproduce. I started with a test cluster, 3 x i2.4xlarge instances running 2.1.4 and Spark 1.2.2. I loaded a few sstables (using sstableloader) from one of my problematic production tables, and each test involved the following simple statement in the Spark shell:

{code}
sc.cassandraTable("prod_analytics_events", "profileevents").take(100).foreach(println)
{code}

I then upgraded to 2.1.6, loaded about 100G of data, and downgraded back to 2.1.4. The net result is that 2.1.4 works with both the small and larger data sets, while 2.1.5 and 2.1.6 work only with the smaller data sets (the 2.1.5 result comes from my production cluster). The larger set caused the hang in both cases.

was (Author: rstrickland):

I did some further testing on this, to at least isolate the affected versions and nail down steps to reproduce. I started with a test cluster, 3 x i2.4xlarge instances running 2.1.4 and Spark 1.2.2. I loaded a few sstables (using sstableloader) from one of my problematic production tables, and each test involved the following simple statement in the Spark shell:

{code}
sc.cassandraTable("prod_analytics_events", "profileevents").take(100).foreach(println)
{code}

I then upgraded to 2.1.6, loaded about 100G of data, and downgraded back to 2.1.4. The net result is that 2.1.4 works with both the small and larger data sets, while 2.1.5 and 2.1.6 work only with the smaller data sets. The larger set caused the hang in both cases.
[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6
[ https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591977#comment-14591977 ]

Robbie Strickland edited comment on CASSANDRA-9607 at 6/18/15 3:32 PM:

Yes, it does. It seems from our observation to be related to 1) our heavy use of DTCS combined with [CASSANDRA-9549|https://issues.apache.org/jira/browse/CASSANDRA-9549] and 2) severe GC pauses (such that it was GCing constantly). We were able to make things stable by moving to G1 with the following modifications:

{code}
#JVM_OPTS="$JVM_OPTS -Xmn${HEAP_NEWSIZE}"
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=1000"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB -XX:+ResizeTLAB"
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"
{code}

This is still a work in progress, but it has allowed us to reach a stable state.

was (Author: rstrickland):

Yes it does. It seems from our observation to be related to 1) our heavy use of DTCS combined with [CASSANDRA-9549|https://issues.apache.org/jira/browse/CASSANDRA-9549] and 2) severe GC pauses (such that it was GCing constantly). We were able to make things stable by moving to G1 with the following modifications:

{{code}}
#JVM_OPTS="$JVM_OPTS -Xmn${HEAP_NEWSIZE}"
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=1000"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB -XX:+ResizeTLAB"
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"
{{code}}

This is still a work in progress, but it has allowed us to reach a stable state.

Get high load after upgrading from 2.1.3 to cassandra 2.1.6
---
Key: CASSANDRA-9607
URL: https://issues.apache.org/jira/browse/CASSANDRA-9607
Project: Cassandra
Issue Type: Bug
Environment:
  OS: CentOS 6 * 4
      Ubuntu 14.04 * 2
  JDK: Oracle JDK 7
Reporter: Study Hsueh
Priority: Critical
Attachments: load.png

After upgrading cassandra version from 2.1.3 to 2.1.6, the average load of my cassandra cluster grows from 0.x~1.x to 3.x~6.x.
What kind of additional information should I provide for this problem?

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
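The modifications in the comment above disable the fixed young-generation size (-Xmn) when switching to G1, which matches general G1 tuning guidance that a fixed young generation overrides the pause-time goal. Below is a minimal Python sketch of a check for that combination in a cassandra-env.sh style file. The function name and the sample text are hypothetical, the sample assumes the quoted form normally used in that file, and the check is only a convenience, not part of Cassandra.

```python
def find_g1_conflicts(env_text):
    """Return warnings for JVM options that work against G1's pause goal.

    Assumption: options are set on uncommented lines, as in cassandra-env.sh.
    """
    # Keep only non-empty lines that are not shell comments.
    active = [
        line.strip()
        for line in env_text.splitlines()
        if line.strip() and not line.strip().startswith("#")
    ]
    # Split into whitespace-separated tokens, ignoring shell double quotes.
    flags = " ".join(active).replace('"', " ").split()
    uses_g1 = "-XX:+UseG1GC" in flags
    warnings = []
    if uses_g1 and any(flag.startswith("-Xmn") for flag in flags):
        warnings.append("-Xmn is set together with -XX:+UseG1GC")
    return warnings


# Hypothetical sample modelled on the options quoted in the comment.
sample = '''
#JVM_OPTS="$JVM_OPTS -Xmn${HEAP_NEWSIZE}"
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=1000"
'''
print(find_g1_conflicts(sample))  # []
```

With the -Xmn line uncommented, the same call returns one warning.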
[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6
[ https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592119#comment-14592119 ]

Study Hsueh edited comment on CASSANDRA-9607 at 6/18/15 5:41 PM:

cassandra configuration
heap dump when oom: http://54.199.247.66/java_1434380208.hprof

was (Author: study):

cassandra configuration

Get high load after upgrading from 2.1.3 to cassandra 2.1.6
---
Key: CASSANDRA-9607
URL: https://issues.apache.org/jira/browse/CASSANDRA-9607
Project: Cassandra
Issue Type: Bug
Environment:
  OS: CentOS 6 * 4
      Ubuntu 14.04 * 2
  JDK: Oracle JDK 7, Oracle JDK 8
  VM: Azure VM Standard A3 * 6
  RAM: 7 GB
  Cores: 4
Reporter: Study Hsueh
Priority: Critical
Attachments: cassandra.yaml, load.png, log.zip

After upgrading cassandra version from 2.1.3 to 2.1.6, the average load of my cassandra cluster grows from 0.x~1.x to 3.x~6.x.
What kind of additional information should I provide for this problem?

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6
[ https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592119#comment-14592119 ]

Study Hsueh edited comment on CASSANDRA-9607 at 6/18/15 5:43 PM:

I had uploaded heap dump when OOM occurred: http://54.199.247.66/java_1434380208.hprof
2015-06-15 13:40:41,200 upgrade to 2.1.6
2015-06-17 18:32:40,740 whole cluster went down

was (Author: study):

cassandra configuration
heap dump when oom: http://54.199.247.66/java_1434380208.hprof
[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6
[ https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592186#comment-14592186 ]

Benedict edited comment on CASSANDRA-9607 at 6/18/15 5:58 PM:

bq. I had uploaded heap dump when OOM occurred:

Ah, you hadn't mentioned OOM. In that case it is _highly likely_ you are experiencing CASSANDRA-9549, which will be fixed in 2.1.7 released shortly, or you can run the patch version posted to that ticket as [~jjirsa] has.

was (Author: benedict):

bq. I had uploaded heap dump when OOM occurred:

Ah, you hadn't mentioned OOM. In that case it is _highly likely_ you are experiencing CASSANDRA-9549, which will be fixed in 2.1.7 released shortly, or you can run the patch version posted to that ticket as [~rstrickland] has.
[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6
[ https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592117#comment-14592117 ]

Study Hsueh edited comment on CASSANDRA-9607 at 6/18/15 5:41 PM:

cassandra log
2015-06-15 13:40:41,200 upgrade to 2.1.6
2015-06-17 18:32:40,740 whole cluster went down

was (Author: study):

cassandra log
2015-06-15 13:40:41,200 upgrade to 2.1.6
2015-06-17 18:32:40,740 whole cluster went down
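For reference, the two log timestamps in the comment above put the outage a little over two days after the upgrade. Below is a minimal Python sketch of that arithmetic, assuming both timestamps are in the same time zone (the log excerpt does not state one). The variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the log excerpt; the comma separates milliseconds.
LOG_FORMAT = "%Y-%m-%d %H:%M:%S,%f"

upgraded_at = datetime.strptime("2015-06-15 13:40:41,200", LOG_FORMAT)
cluster_down_at = datetime.strptime("2015-06-17 18:32:40,740", LOG_FORMAT)

# Time the cluster ran on 2.1.6 before going down.
elapsed = cluster_down_at - upgraded_at
print(elapsed)  # 2 days, 4:51:59.540000
print(round(elapsed.total_seconds() / 3600, 1))  # 52.9
```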
[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6
[ https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589589#comment-14589589 ]

Study Hsueh edited comment on CASSANDRA-9607 at 6/17/15 10:27 AM:

This issue caused all of the nodes to go down.
I will attach the log later, after I downgrade to 2.1.3...

was (Author: study):

This issue caused all of the nodes to go down.