16 Gig heap, with G1. Pertinent info from jvm.options below (we’re using m2.2xlarge instances in AWS):
#################
# HEAP SETTINGS #
#################

# Heap size is automatically calculated by cassandra-env based on this
# formula: max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB))
# That is:
# - calculate 1/2 ram and cap to 1024MB
# - calculate 1/4 ram and cap to 8192MB
# - pick the max
#
# For production use you may wish to adjust this for your environment.
# If that's the case, uncomment the -Xmx and Xms options below to override the
# automatic calculation of JVM heap memory.
#
# It is recommended to set min (-Xms) and max (-Xmx) heap sizes to
# the same value to avoid stop-the-world GC pauses during resize, and
# so that we can lock the heap in memory on startup to prevent any
# of it from being swapped out.
-Xms16G
-Xmx16G

# Young generation size is automatically calculated by cassandra-env
# based on this formula: min(100 * num_cores, 1/4 * heap size)
#
# The main trade-off for the young generation is that the larger it
# is, the longer GC pause times will be. The shorter it is, the more
# expensive GC will be (usually).
#
# It is not recommended to set the young generation size if using the
# G1 GC, since that will override the target pause-time goal.
# More info: http://www.oracle.com/technetwork/articles/java/g1gc-1984535.html
#
# The example below assumes a modern 8-core+ machine for decent
# times. If in doubt, and if you do not particularly want to tweak, go
# 100 MB per physical CPU core.
#-Xmn800M

#################
#  GC SETTINGS  #
#################

### CMS Settings
#-XX:+UseParNewGC
#-XX:+UseConcMarkSweepGC
#-XX:+CMSParallelRemarkEnabled
#-XX:SurvivorRatio=8
#-XX:MaxTenuringThreshold=1
#-XX:CMSInitiatingOccupancyFraction=75
#-XX:+UseCMSInitiatingOccupancyOnly
#-XX:CMSWaitDuration=10000
#-XX:+CMSParallelInitialMarkEnabled
#-XX:+CMSEdenChunksRecordAlways
# some JVMs will fill up their heap when accessed via JMX, see CASSANDRA-6541
#-XX:+CMSClassUnloadingEnabled

### G1 Settings (experimental, comment previous section and uncomment section below to enable)
## Use the Hotspot garbage-first collector.
-XX:+UseG1GC
#
## Have the JVM do less remembered set work during STW, instead
## preferring concurrent GC. Reduces p99.9 latency.
-XX:G1RSetUpdatingPauseTimePercent=5
#
## Main G1GC tunable: lowering the pause target will lower throughput and vise versa.
## 200ms is the JVM default and lowest viable setting
## 1000ms increases throughput. Keep it smaller than the timeouts in cassandra.yaml.
-XX:MaxGCPauseMillis=500

## Optional G1 Settings

# Save CPU time on large (>= 16GB) heaps by delaying region scanning
# until the heap is 70% full. The default in Hotspot 8u40 is 40%.
-XX:InitiatingHeapOccupancyPercent=70

# For systems with > 8 cores, the default ParallelGCThreads is 5/8 the number of logical cores.
# Otherwise equal to the number of cores when 8 or less.
# Machines with > 10 cores should try setting these to <= full cores.
#-XX:ParallelGCThreads=16

# By default, ConcGCThreads is 1/4 of ParallelGCThreads.
# Setting both to the same value can reduce STW durations.
#-XX:ConcGCThreads=16

### GC logging options -- uncomment to enable

#-XX:+PrintGCDetails
#-XX:+PrintGCDateStamps
#-XX:+PrintHeapAtGC
#-XX:+PrintTenuringDistribution
#-XX:+PrintGCApplicationStoppedTime
#-XX:+PrintPromotionFailure
#-XX:PrintFLSStatistics=1
#-Xloggc:/var/log/cassandra/gc.log
#-XX:+UseGCLogFileRotation
#-XX:NumberOfGCLogFiles=10
#-XX:GCLogFileSize=10M
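For context, a rough sketch (Python) of the automatic heap and young-generation calculations described in the comments above. The ~34 GiB RAM and 4 vCPU figures are assumptions taken from the published m2.2xlarge specs, not numbers reported in this thread.

# Sketch of the cassandra-env sizing formulas quoted in the comments above.
# RAM and core counts below are assumptions (published m2.2xlarge specs),
# not values reported in this thread.

def default_max_heap_mb(ram_mb):
    # max(min(1/2 ram, 1024MB), min(1/4 ram, 8192MB))
    return max(min(ram_mb // 2, 1024), min(ram_mb // 4, 8192))

def default_young_gen_mb(num_cores, heap_mb):
    # min(100 MB * num_cores, 1/4 * heap size); only relevant for CMS,
    # since G1 sizes the young generation itself.
    return min(100 * num_cores, heap_mb // 4)

ram_mb = 34 * 1024   # ~34 GiB on an m2.2xlarge (assumption)
cores = 4            # m2.2xlarge vCPUs (assumption)

print(default_max_heap_mb(ram_mb))        # 8192 -> 8 GB default heap
print(default_young_gen_mb(cores, 8192))  # 400  -> ~100 MB per core

By the default formula this instance would get an 8 GB heap, so the explicit -Xms16G / -Xmx16G above doubles it.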
From: Alexander Dejanovski <a...@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, April 3, 2017 at 8:00 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: cassandra OOM

Hi,

Could you share your GC settings? G1 or CMS? Heap size, etc.?

Thanks,

On Sun, Apr 2, 2017 at 10:30 PM Gopal, Dhruva <dhruva.go...@aspect.com> wrote:

Hi – We've had what looks like an OOM situation with Cassandra (we have a dump file that got generated) in our staging (performance/load testing) environment, and I wanted to reach out to this user group to see if you had any recommendations on how we should approach our investigation into the cause of this issue. The logs don't seem to point to any obvious issues, and we're no experts in analyzing this by any means, so we were looking for guidance on how to proceed. Should we enter a Jira as well? We're on Cassandra 3.9 and are running a six-node cluster. This happened in a controlled load testing environment. Feedback will be much appreciated!

Regards,
Dhruva

--
-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

This email (including any attachments) is proprietary to Aspect Software, Inc. and may contain information that is confidential. If you have received this message in error, please do not read, copy or forward this message. Please notify the sender immediately, delete it from your system and destroy any copies. You may not further disclose or distribute this email or its attachments.
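As a follow-up on the investigation question: if the GC logging options quoted above are uncommented, one low-effort check is to scan gc.log for long stop-the-world pauses. A minimal sketch (Python), assuming the JDK 8 -XX:+PrintGCApplicationStoppedTime line format ("Total time for which application threads were stopped: N seconds") and the /var/log/cassandra/gc.log path from the settings above; the 1-second threshold is an arbitrary example.

import re

# Report the longest application stop times recorded in a JDK 8 gc.log.
# Log path and the 1-second threshold are example values, not from the thread.
PAUSE_RE = re.compile(r"application threads were stopped: ([0-9.]+) seconds")

pauses = []
with open("/var/log/cassandra/gc.log") as log:
    for line in log:
        match = PAUSE_RE.search(line)
        if match:
            pauses.append(float(match.group(1)))

pauses.sort(reverse=True)
print("longest pauses (s):", pauses[:10])
print("pauses over 1s:", sum(1 for p in pauses if p > 1.0))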