[jira] [Commented] (CASSANDRA-10449) OOM on bootstrap after long GC pause

2016-07-06 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15364543#comment-15364543
 ] 

Jeff Jirsa commented on CASSANDRA-10449:


If anything, it's a combination of CASSANDRA-10474 and CASSANDRA-9681 (as 
[~mishail] pointed out), and not CASSANDRA-10680. 

> OOM on bootstrap after long GC pause
> 
>
> Key: CASSANDRA-10449
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10449
> Project: Cassandra
>  Issue Type: Bug
> Environment: Ubuntu 14.04, AWS
>Reporter: Robbie Strickland
>  Labels: gc
> Fix For: 2.1.x
>
> Attachments: GCpath.txt, heap_dump.png, system.log.10-05, 
> thread_dump.log, threads.txt
>
>
> I have a 20-node cluster (i2.4xlarge) with vnodes (default of 256) and 
> 500-700GB per node.  SSTable counts are <10 per table.  I am attempting to 
> provision additional nodes, but bootstrapping OOMs every time after about 10 
> hours with a sudden long GC pause:
> {noformat}
> INFO  [Service Thread] 2015-10-05 23:33:33,373 GCInspector.java:252 - G1 Old 
> Generation GC in 1586126ms.  G1 Old Gen: 49213756976 -> 49072277176;
> ...
> ERROR [MemtableFlushWriter:454] 2015-10-05 23:33:33,380 
> CassandraDaemon.java:223 - Exception in thread 
> Thread[MemtableFlushWriter:454,5,main]
> java.lang.OutOfMemoryError: Java heap space
> {noformat}
> I have tried increasing max heap to 48G just to get through the bootstrap, to 
> no avail.
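
For scale, the GCInspector line quoted above can be unpacked with a few lines of Python. This is only a sketch: the regular expression is an assumption fitted to this one log line, not a general parser for Cassandra's GC logging.

```python
import re

# The GCInspector line from the issue description, joined onto one line.
LOG_LINE = (
    "INFO  [Service Thread] 2015-10-05 23:33:33,373 GCInspector.java:252 - "
    "G1 Old Generation GC in 1586126ms.  G1 Old Gen: 49213756976 -> 49072277176;"
)

# Assumed pattern: fitted to the sample line above only.
GC_PATTERN = re.compile(
    r"GC in (?P<pause_ms>\d+)ms\.\s+G1 Old Gen: (?P<before>\d+) -> (?P<after>\d+)"
)


def summarize_gc(line):
    """Return pause length and old-gen bytes reclaimed for one GCInspector line."""
    match = GC_PATTERN.search(line)
    if match is None:
        raise ValueError("line does not look like the GCInspector sample")
    pause_ms = int(match.group("pause_ms"))
    before = int(match.group("before"))
    after = int(match.group("after"))
    return {
        "pause_ms": pause_ms,
        "pause_minutes": pause_ms / 60000.0,
        "reclaimed_bytes": before - after,
        "reclaimed_fraction": (before - after) / before,
    }


if __name__ == "__main__":
    for key, value in summarize_gc(LOG_LINE).items():
        print(key, value)
```

On the quoted line this works out to a pause of roughly 26 minutes that freed about 141 MB out of roughly 49 GB of old generation, well under 1%, which suggests the heap was full of live data rather than garbage.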



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10449) OOM on bootstrap after long GC pause

2016-07-06 Thread Robbie Strickland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15364183#comment-15364183
 ] 

Robbie Strickland commented on CASSANDRA-10449:
---

[~pauloricardomg] can we change the resolution to something other than 
"duplicate" to avoid confusion? I still run across people who hit this issue, 
and it seems it was resolved with an upgrade to 2.1.11.



[jira] [Commented] (CASSANDRA-10449) OOM on bootstrap after long GC pause

2016-05-11 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281004#comment-15281004
 ] 

Paulo Motta commented on CASSANDRA-10449:
-

[~jjirsa] good catch! I'll leave it closed since there was no activity on this 
for a while, but we can always reopen if reports on newer versions come up.

If this is still a problem we can probably add a configurable limit on the 
number of simultaneous streams during bootstrap. Another optimization we can 
probably do is to discard compression chunks after they're read during stream.
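
The first idea, a cap on simultaneous streams, can be illustrated in a few lines. This is a generic sketch only, not Cassandra code: {{run_with_stream_limit}}, {{max_streams}} and {{receive}} are hypothetical names, and a real implementation would live in the Java streaming layer.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_stream_limit(sources, receive, max_streams):
    """Call receive(source) for every source, at most max_streams at a time.

    Hypothetical helper: it illustrates the idea of a configurable limit only.
    """
    limit = threading.Semaphore(max_streams)

    def guarded(source):
        # Block here until one of the max_streams slots is free.
        with limit:
            return receive(source)

    # One worker per source, so the semaphore (not the pool size) is the cap.
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        return list(pool.map(guarded, sources))


if __name__ == "__main__":
    state = {"active": 0, "peak": 0}
    lock = threading.Lock()

    def fake_receive(source):
        # Track how many "streams" are in flight at once.
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.01)  # stand-in for receiving one stream
        with lock:
            state["active"] -= 1
        return source

    run_with_stream_limit(list(range(9)), fake_receive, max_streams=3)
    print("peak concurrent streams:", state["peak"])
```

With the limit set to 3, the peak number of concurrent receivers never exceeds 3 no matter how many sources offer data, which is the property such a setting would be meant to guarantee on the bootstrapping node.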



[jira] [Commented] (CASSANDRA-10449) OOM on bootstrap after long GC pause

2016-05-11 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280674#comment-15280674
 ] 

Jeff Jirsa commented on CASSANDRA-10449:


CASSANDRA-10680 caused GC pressure on the stream senders, not on the 
bootstrapping nodes.





[jira] [Commented] (CASSANDRA-10449) OOM on bootstrap after long GC pause

2016-05-11 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280512#comment-15280512
 ] 

Paulo Motta commented on CASSANDRA-10449:
-

Looking back at the heap dump, this was clearly caused by CASSANDRA-10680, so 
I'm closing this as a duplicate of that.



[jira] [Commented] (CASSANDRA-10449) OOM on bootstrap after long GC pause

2015-10-26 Thread Robbie Strickland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14974716#comment-14974716
 ] 

Robbie Strickland commented on CASSANDRA-10449:
---

As a workaround I was able to simply restart the node with {{auto_bootstrap}} 
set to false, which allowed it to successfully join.  Obviously there appear to 
be multiple issues here, as the behavior in 2.1.7 and 2.1.11 is different with 
an otherwise identical setup.



[jira] [Commented] (CASSANDRA-10449) OOM on bootstrap after long GC pause

2015-10-22 Thread Robbie Strickland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969215#comment-14969215
 ] 

Robbie Strickland commented on CASSANDRA-10449:
---

Also, for reference, tpstats shows nothing in the queues:

{noformat}
ubuntu@eventcass4x087:~$ nodetool tpstats
Pool Name                 Active   Pending   Completed   Blocked  All time blocked
MutationStage                  0         0    85431226         0                 0
ReadStage                      0         0           0         0                 0
RequestResponseStage           0         0          48         0                 0
ReadRepairStage                0         0           0         0                 0
CounterMutationStage           0         0           0         0                 0
MiscStage                      0         0           0         0                 0
HintedHandoff                  0         0          29         0                 0
GossipStage                    0         0      565556         0                 0
CacheCleanupExecutor           0         0           0         0                 0
InternalResponseStage          0         0           0         0                 0
CommitLogArchiver              0         0           0         0                 0
CompactionExecutor             0         0       12774         0                 0
ValidationExecutor             0         0           0         0                 0
MigrationStage                 0         0           0         0                 0
AntiEntropyStage               0         0           0         0                 0
PendingRangeCalculator         0         0           3         0                 0
Sampler                        0         0           0         0                 0
MemtableFlushWriter            0         0        7157         0                 0
MemtablePostFlush              0         0       10083         0                 0
MemtableReclaimMemory          0         0        9340         0                 0

Message type           Dropped
READ                         0
RANGE_SLICE                  0
_TRACE                       0
MUTATION                     0
COUNTER_MUTATION             0
BINARY                       0
REQUEST_RESPONSE             0
PAGED_RANGE                  0
READ_REPAIR                  0
{noformat}



[jira] [Commented] (CASSANDRA-10449) OOM on bootstrap after long GC pause

2015-10-22 Thread Robbie Strickland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969125#comment-14969125
 ] 

Robbie Strickland commented on CASSANDRA-10449:
---

I decided to try upgrading to 2.1.11 to see if the issue was resolved by 
CASSANDRA-9681.  The node has been joining for over 24 hours, even though it 
appears to have finished streaming after about 6 hours:

{noformat}
ubuntu@eventcass4x087:~$ nodetool netstats | grep -v 100%
Mode: JOINING
Bootstrap 7047c510-7732-11e5-a7e7-63f53bbd2778
Receiving 171 files, 95313491312 bytes total. Already received 171 files, 95313491312 bytes total
Receiving 165 files, 78860134041 bytes total. Already received 165 files, 78860134041 bytes total
Receiving 158 files, 77709354374 bytes total. Already received 158 files, 77709354374 bytes total
Receiving 184 files, 106710570690 bytes total. Already received 184 files, 106710570690 bytes total
Receiving 136 files, 35699286217 bytes total. Already received 136 files, 35699286217 bytes total
Receiving 169 files, 53498180215 bytes total. Already received 169 files, 53498180215 bytes total
Receiving 197 files, 129020987979 bytes total. Already received 197 files, 129020987979 bytes total
Receiving 196 files, 113904035360 bytes total. Already received 196 files, 113904035360 bytes total
Receiving 172 files, 47685647028 bytes total. Already received 172 files, 47685647028 bytes total
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name     Active   Pending   Completed
Commands         n/a         1           0
Responses        n/a         0    83743675
{noformat}

It doesn't appear to still be building indexes either:

{noformat}
ubuntu@eventcass4x087:~$ nodetool compactionstats
pending tasks: 2
compaction type   keyspace                table      completed   total       unit    progress
Compaction        prod_analytics_events   wuevents   163704673   201033961   bytes   81.43%
Active compaction remaining time : n/a
{noformat}

So I'm not sure why it's still joining.  Any thoughts?
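
The claim that streaming has finished can be checked mechanically from the netstats output above. A minimal Python sketch, assuming the "Receiving ... Already received ..." wording shown in this particular {{nodetool netstats}} output (the format may differ in other versions):

```python
import re

# Per-session summary lines copied from the netstats output in this comment.
NETSTATS = """
Receiving 171 files, 95313491312 bytes total. Already received 171 files, 95313491312 bytes total
Receiving 165 files, 78860134041 bytes total. Already received 165 files, 78860134041 bytes total
Receiving 158 files, 77709354374 bytes total. Already received 158 files, 77709354374 bytes total
Receiving 184 files, 106710570690 bytes total. Already received 184 files, 106710570690 bytes total
Receiving 136 files, 35699286217 bytes total. Already received 136 files, 35699286217 bytes total
Receiving 169 files, 53498180215 bytes total. Already received 169 files, 53498180215 bytes total
Receiving 197 files, 129020987979 bytes total. Already received 197 files, 129020987979 bytes total
Receiving 196 files, 113904035360 bytes total. Already received 196 files, 113904035360 bytes total
Receiving 172 files, 47685647028 bytes total. Already received 172 files, 47685647028 bytes total
"""

# \s+ between the two halves also tolerates the line being wrapped.
SESSION = re.compile(
    r"Receiving (\d+) files, (\d+) bytes total\.\s+"
    r"Already received (\d+) files, (\d+) bytes total"
)


def streaming_progress(text):
    """Sum expected vs. received files and bytes over all sessions found."""
    totals = {"sessions": 0, "files": 0, "bytes": 0,
              "files_done": 0, "bytes_done": 0}
    for files, size, files_done, size_done in SESSION.findall(text):
        totals["sessions"] += 1
        totals["files"] += int(files)
        totals["bytes"] += int(size)
        totals["files_done"] += int(files_done)
        totals["bytes_done"] += int(size_done)
    # Complete only if at least one session was seen and nothing is missing.
    totals["complete"] = (
        totals["sessions"] > 0
        and totals["files"] == totals["files_done"]
        and totals["bytes"] == totals["bytes_done"]
    )
    return totals


if __name__ == "__main__":
    print(streaming_progress(NETSTATS))
```

For the nine sessions listed, this reports 1548 of 1548 files and about 738 GB received, so the remaining time in JOINING is not explained by outstanding streams.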



[jira] [Commented] (CASSANDRA-10449) OOM on bootstrap after long GC pause

2015-10-16 Thread Mikhail Stepura (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14961340#comment-14961340
 ] 

Mikhail Stepura commented on CASSANDRA-10449:
-

Any chance to get the dump file itself? Of course, only if it doesn't contain 
any sensitive information.



[jira] [Commented] (CASSANDRA-10449) OOM on bootstrap after long GC pause

2015-10-16 Thread Robbie Strickland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14961343#comment-14961343
 ] 

Robbie Strickland commented on CASSANDRA-10449:
---

Yes, sorry, I was working on getting it to S3.  You can get it 
[here|https://s3.amazonaws.com/twc-analytics-public/java_1445001330.hprof].



[jira] [Commented] (CASSANDRA-10449) OOM on bootstrap after long GC pause

2015-10-16 Thread Mikhail Stepura (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14961610#comment-14961610
 ] 

Mikhail Stepura commented on CASSANDRA-10449:
-

What was the heap size for the dump? 16G?



[jira] [Commented] (CASSANDRA-10449) OOM on bootstrap after long GC pause

2015-10-16 Thread Robbie Strickland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14961661#comment-14961661
 ] 

Robbie Strickland commented on CASSANDRA-10449:
---

Yes, 16GB.



[jira] [Commented] (CASSANDRA-10449) OOM on bootstrap after long GC pause

2015-10-16 Thread Mikhail Stepura (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14961750#comment-14961750
 ] 

Mikhail Stepura commented on CASSANDRA-10449:
-

CASSANDRA-9681?
