[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6

2015-06-30 Thread Study Hsueh (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592119#comment-14592119
 ] 

Study Hsueh edited comment on CASSANDRA-9607 at 7/1/15 2:23 AM:


2015-06-15 13:40:41,200 upgrade to 2.1.6
2015-06-17 18:32:40,740 whole cluster went down




was (Author: study):
I had uploaded heap dump when OOM occurred: 
http://54.199.247.66/java_1434380208.hprof

2015-06-15 13:40:41,200 upgrade to 2.1.6
2015-06-17 18:32:40,740 whole cluster went down



 Get high load after upgrading from 2.1.3 to cassandra 2.1.6
 ---

 Key: CASSANDRA-9607
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9607
 Project: Cassandra
  Issue Type: Bug
 Environment: OS: 
 CentOS 6 * 4
 Ubuntu 14.04 * 2
 JDK: Oracle JDK 7, Oracle JDK 8
 VM: Azure VM Standard A3 * 6
 RAM: 7 GB
 Cores: 4
Reporter: Study Hsueh
Assignee: Tyler Hobbs
Priority: Critical
 Fix For: 2.1.x, 2.2.x

 Attachments: GC_state.png, cassandra.yaml, client_blocked_thread.png, 
 cpu_profile.png, dump.tdump, load.png, log.zip, schema.zip, vm_monitor.png


 After upgrading Cassandra from 2.1.3 to 2.1.6, the average load of my 
 Cassandra cluster grew from 0.x~1.x to 3.x~6.x. 
 What kind of additional information should I provide for this problem?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6

2015-06-24 Thread Study Hsueh (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600666#comment-14600666
 ] 

Study Hsueh edited comment on CASSANDRA-9607 at 6/25/15 4:37 AM:
-

My colleague has repeated the query in 2.1.3 again, and the cluster went down 
again. So the root cause should be the query.


was (Author: study):
My colleague have repeated the query in 2.1.3 again, and the cluster went down 
again. So the root cause should be the query.



[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6

2015-06-22 Thread Robbie Strickland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596605#comment-14596605
 ] 

Robbie Strickland edited comment on CASSANDRA-9607 at 6/22/15 8:43 PM:
---

I was able to bring the data down to my local machine and replicate the issue 
on a fresh 2.1.5 install while profiling.  I've attached screen shots of the 
session, as well as the thread state on the client while it's happening.  You 
can see the server blocking on select and the client blocking on accept, which 
of course causes both ends to become unresponsive.


was (Author: rstrickland):
I was able to bring the data down to my local machine and replicate the issue 
on a fresh install while profiling.  I've attached screen shots of the session, 
as well as the thread state on the client while it's happening.  You can see 
the server blocking on select and the client blocking on accept, which of 
course causes both ends to become unresponsive.



[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6

2015-06-22 Thread Robbie Strickland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595975#comment-14595975
 ] 

Robbie Strickland edited comment on CASSANDRA-9607 at 6/22/15 2:25 PM:
---

I did some further testing on this, to at least isolate the affected versions 
and nail down steps to reproduce.  I started with a test cluster, 3 x 
i2.4xlarge instances running 2.1.4 and Spark 1.2.2.  I loaded a few sstables 
(using sstableloader) from one of my problematic production tables, and each 
test involved the following simple statement in the Spark shell:

{code}
sc.cassandraTable("mykeyspace", "mytable").take(100).foreach(println)
{code}

I then upgraded to 2.1.6, loaded about 100G of data, and downgraded back to 
2.1.4.  The net result is that 2.1.4 works with both the small and larger data 
sets, while 2.1.5 and 2.1.6 work only with the smaller data sets (the 
2.1.5 result comes from my production cluster).  The larger set caused the hang 
in both cases.


was (Author: rstrickland):
I did some further testing on this, to at least isolate the affected versions 
and nail down steps to reproduce.  I started with a test cluster, 3 x 
i2.4xlarge instances running 2.1.4 and Spark 1.2.2.  I loaded a few sstables 
(using sstableloader) from one of my problematic production tables, and each 
test involved the following simple statement in the Spark shell:

{code}
sc.cassandraTable("prod_analytics_events", "profileevents").take(100).foreach(println)
{code}

I then upgraded to 2.1.6, loaded about 100G of data, and downgraded back to 
2.1.4.  The net result is that 2.1.4 works with both the small and larger data 
sets, while 2.1.5 and 2.1.6 work only with only the smaller data sets (the 
2.1.5 result comes from my production cluster).  The larger set caused the hang 
in both cases.

 Get high load after upgrading from 2.1.3 to cassandra 2.1.6
 ---

 Key: CASSANDRA-9607
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9607
 Project: Cassandra
  Issue Type: Bug
 Environment: OS: 
 CentOS 6 * 4
 Ubuntu 14.04 * 2
 JDK: Oracle JDK 7, Oracle JDK 8
 VM: Azure VM Standard A3 * 6
 RAM: 7 GB
 Cores: 4
Reporter: Study Hsueh
Assignee: Tyler Hobbs
Priority: Critical
 Fix For: 2.1.x, 2.2.0 rc2

 Attachments: cassandra.yaml, load.png, log.zip, schema.zip


 After upgrading Cassandra from 2.1.3 to 2.1.6, the average load of my 
 Cassandra cluster grew from 0.x~1.x to 3.x~6.x. 
 What kind of additional information should I provide for this problem?





[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6

2015-06-22 Thread Robbie Strickland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595975#comment-14595975
 ] 

Robbie Strickland edited comment on CASSANDRA-9607 at 6/22/15 2:24 PM:
---

I did some further testing on this, to at least isolate the affected versions 
and nail down steps to reproduce.  I started with a test cluster, 3 x 
i2.4xlarge instances running 2.1.4 and Spark 1.2.2.  I loaded a few sstables 
(using sstableloader) from one of my problematic production tables, and each 
test involved the following simple statement in the Spark shell:

{code}
sc.cassandraTable("prod_analytics_events", "profileevents").take(100).foreach(println)
{code}

I then upgraded to 2.1.6, loaded about 100G of data, and downgraded back to 
2.1.4.  The net result is that 2.1.4 works with both the small and larger data 
sets, while 2.1.5 and 2.1.6 work only with the smaller data sets (the 
2.1.5 result comes from my production cluster).  The larger set caused the hang 
in both cases.


was (Author: rstrickland):
I did some further testing on this, to at least isolate the affected versions 
and nail down steps to reproduce.  I started with a test cluster, 3 x 
i2.4xlarge instances running 2.1.4 and Spark 1.2.2.  I loaded a few sstables 
(using sstableloader) from one of my problematic production tables, and each 
test involved the following simple statement in the Spark shell:

{code}
sc.cassandraTable("prod_analytics_events", "profileevents").take(100).foreach(println)
{code}

I then upgraded to 2.1.6, loaded about 100G of data, and downgraded back to 
2.1.4.  The net result is that 2.1.4 works with both the small and larger data 
sets, while 2.1.5 and 2.1.6 work only with only the smaller data sets.  The 
larger set caused the hang in both cases.



[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6

2015-06-18 Thread Robbie Strickland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591977#comment-14591977
 ] 

Robbie Strickland edited comment on CASSANDRA-9607 at 6/18/15 3:32 PM:
---

Yes it does.  It seems from our observation to be related to 1) our heavy use 
of DTCS combined with 
[CASSANDRA-9549|https://issues.apache.org/jira/browse/CASSANDRA-9549] and 2) 
severe GC pauses (such that it was GCing constantly).  

We were able to make things stable by moving to G1 with the following 
modifications:

{code}
#JVM_OPTS="$JVM_OPTS -Xmn${HEAP_NEWSIZE}"
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=1000"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB -XX:+ResizeTLAB"
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"
{code}

This is still a work in progress, but it has allowed us to reach a stable state.
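
The -Xmn line above is commented out, presumably because fixing the young generation size works against G1's pause-time target. As a rough sketch only, a check along these lines could catch the two being combined by accident. The function name check_g1_opts and its messages are hypothetical and not part of cassandra-env.sh:

```shell
#!/bin/sh
# Hypothetical helper (not part of cassandra-env.sh): report whether a
# JVM option string enables G1 and whether it also pins the young
# generation size with -Xmn.
check_g1_opts() {
    opts="$1"
    case "$opts" in
        *-XX:+UseG1GC*) ;;                       # G1 is enabled, keep checking
        *) echo "G1 not enabled"; return 0 ;;
    esac
    case "$opts" in
        *-Xmn*) echo "warning: -Xmn set with G1"; return 1 ;;
        *) echo "ok: G1 without -Xmn"; return 0 ;;
    esac
}

# Build the option string the same way as the settings above.
JVM_OPTS=""
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=1000"
check_g1_opts "$JVM_OPTS"
```

The check is a plain substring match on the option string, so it only inspects what the script assembled, not what a running JVM actually picked up.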


was (Author: rstrickland):
Yes it does.  It seems from our observation to be related to 1) our heavy use 
of DTCS combined with 
[CASSANDRA-9549|https://issues.apache.org/jira/browse/CASSANDRA-9549] and 2) 
severe GC pauses (such that it was GCing constantly).  

We were able to make things stable by moving to G1 with the following 
modifications:

{{code}}
#JVM_OPTS="$JVM_OPTS -Xmn${HEAP_NEWSIZE}"
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=1000"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB -XX:+ResizeTLAB"
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"
{{code}}

This is still a work in progress, but it has allowed us to reach a stable state.

 Get high load after upgrading from 2.1.3 to cassandra 2.1.6
 ---

 Key: CASSANDRA-9607
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9607
 Project: Cassandra
  Issue Type: Bug
 Environment: OS: 
 CentOS 6 * 4
 Ubuntu 14.04 * 2
 JDK: Oracle JDK 7
Reporter: Study Hsueh
Priority: Critical
 Attachments: load.png


 After upgrading Cassandra from 2.1.3 to 2.1.6, the average load of my 
 Cassandra cluster grew from 0.x~1.x to 3.x~6.x. 
 What kind of additional information should I provide for this problem?





[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6

2015-06-18 Thread Study Hsueh (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592119#comment-14592119
 ] 

Study Hsueh edited comment on CASSANDRA-9607 at 6/18/15 5:41 PM:
-

cassandra configuration
heap dump when oom: http://54.199.247.66/java_1434380208.hprof


was (Author: study):
cassandra configuration

 Get high load after upgrading from 2.1.3 to cassandra 2.1.6
 ---

 Key: CASSANDRA-9607
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9607
 Project: Cassandra
  Issue Type: Bug
 Environment: OS: 
 CentOS 6 * 4
 Ubuntu 14.04 * 2
 JDK: Oracle JDK 7, Oracle JDK 8
 VM: Azure VM Standard A3 * 6
 RAM: 7 GB
 Cores: 4
Reporter: Study Hsueh
Priority: Critical
 Attachments: cassandra.yaml, load.png, log.zip


 After upgrading Cassandra from 2.1.3 to 2.1.6, the average load of my 
 Cassandra cluster grew from 0.x~1.x to 3.x~6.x. 
 What kind of additional information should I provide for this problem?





[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6

2015-06-18 Thread Study Hsueh (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592119#comment-14592119
 ] 

Study Hsueh edited comment on CASSANDRA-9607 at 6/18/15 5:43 PM:
-

I have uploaded a heap dump taken when the OOM occurred: 
http://54.199.247.66/java_1434380208.hprof

2015-06-15 13:40:41,200 upgrade to 2.1.6
2015-06-17 18:32:40,740 whole cluster went down




was (Author: study):
cassandra configuration
heap dump when oom: http://54.199.247.66/java_1434380208.hprof



[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6

2015-06-18 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592186#comment-14592186
 ] 

Benedict edited comment on CASSANDRA-9607 at 6/18/15 5:58 PM:
--

bq. I had uploaded heap dump when OOM occurred:

Ah, you hadn't mentioned OOM. In that case it is _highly likely_ you are 
experiencing CASSANDRA-9549, which will be fixed in 2.1.7, due for release 
shortly. Alternatively, you can run the patched version posted to that ticket, 
as [~jjirsa] has.


was (Author: benedict):
bq. I had uploaded heap dump when OOM occurred:

Ah, you hadn't mentioned OOM. In that case it is _highly likely_ you are 
experiencing CASSANDRA-9549, which will be fixed in 2.1.7 released shortly, or 
you can run the patch version posted to that ticket as [~rstrickland] has.



[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6

2015-06-18 Thread Study Hsueh (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592117#comment-14592117
 ] 

Study Hsueh edited comment on CASSANDRA-9607 at 6/18/15 5:41 PM:
-

cassandra log
2015-06-15 13:40:41,200 upgrade to 2.1.6
2015-06-17 18:32:40,740 whole cluster went down


was (Author: study):
cassandra log
2015-06-15 13:40:41,200 upgrade to 2.1.6
2015-06-17 18:32:40,740 whole cluster went down



[jira] [Comment Edited] (CASSANDRA-9607) Get high load after upgrading from 2.1.3 to cassandra 2.1.6

2015-06-17 Thread Study Hsueh (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589589#comment-14589589
 ] 

Study Hsueh edited comment on CASSANDRA-9607 at 6/17/15 10:27 AM:
--

This issue caused all of the nodes to go down. I will attach the logs after I 
downgrade to 2.1.3...


was (Author: study):
This issues cause all of nodes downs.
