@Todd, I noticed some new opts in your cassandra.in.sh. Is there any
documentation on what these opts are and what they do?

For instance, AggressiveOpts.

On Mon, Jul 26, 2010 at 4:33 PM, B. Todd Burruss <bburr...@real.com> wrote:

> i run cassandra with a 30gb heap on machines with 48gb total with good
> results.  i don't use more just because i want to leave some for the OS
> to cache disk pages, etc.  i did have the problem a couple of times with
> GC doing a full stop on the JVM because it couldn't keep up.  my
> understanding of the CMS GC is that it kicks in when a certain
> percentage of the JVM heap is used.  by tweaking
> -XX:CMSInitiatingOccupancyFraction you can make this kick in sooner (or
> later) and this fixed it for me.
> my JVM opts differ just slightly from the latest cassandra changes in
> 0.6:
> JVM_OPTS=" \
>        -ea \
>        -Xms30G \
>        -Xmx30G \
>        -XX:SurvivorRatio=128 \
>        -XX:MaxTenuringThreshold=0 \
>        -XX:TargetSurvivorRatio=90 \
>        -XX:+AggressiveOpts \
>        -XX:+UseParNewGC \
>        -XX:+UseConcMarkSweepGC \
>        -XX:+CMSParallelRemarkEnabled \
>        -XX:CMSInitiatingOccupancyFraction=88 \
>        -XX:+HeapDumpOnOutOfMemoryError \
>        -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -verbose:gc \
>        -Dnetworkaddress.cache.ttl=60 \
>        -Dcom.sun.management.jmxremote.port=6786 \
>        -Dcom.sun.management.jmxremote.ssl=false \
>        -Dcom.sun.management.jmxremote.authenticate=false \
> "
>
> On Mon, 2010-07-26 at 14:04 -0700, Peter Schuller wrote:
> > > If the cache is stored in the heap, how big can the heap be made
> > > realistically on a 24gb ram machine? I am a java newbie but I have read
> > > concerns with going over 8gb for the heap as the GC can be too
> > > painful/take too long. I already have seen timeout issues (node is dead
> > > errors) under load during GC or compaction. Can/should the heap be set
> > > to 16gb with 24gb ram?
> >
> > I have never run Cassandra in production with such a large heap, so
> > I'll let others comment on practical experience with that.
> >
> > In general however, with the JVM and the CMS garbage collector (which
> > is enabled by default with Cassandra), having a large heap is not
> > necessarily a problem depending on the application's workload.
> >
> > In terms of GCs taking too long: with the default throughput
> > collector used by the JVM you will tend to see the longest pause times
> > scale roughly linearly with heap size. Most pauses would still be
> > short (these are what is known as young generation collections), but
> > periodically a so-called full collection is done. With the throughput
> > collector, this implies stopping all Java threads while the *entire*
> > Java heap is garbage collected.
> >
> > With the CMS (Concurrent Mark/Sweep) collector the intent is that the
> > periodic scans of the entire Java heap are done concurrently with the
> > application without pausing it. Fallback to full stop-the-world
> > garbage collections can still happen if CMS fails to complete such
> > work fast enough, in which case tweaking of garbage collection
> > settings may be required.
> >
> > One thing to consider in any case is how much memory you actually
> > need; the more you give to the JVM, the less there is left for the OS
> > to cache file contents. If for example your true working set in
> > cassandra is, to grab a random number, 3 GB and you set the heap
> > size to 15 GB - now you're wasting a lot of memory by allowing the JVM
> > to postpone GC until it starts approaching the 15 GB mark. This is
> > actually good (normally) for overall GC throughput, but not
> > necessarily good overall for something like cassandra where there is a
> > direct trade-off with cache eviction in the operating system possibly
> > causing additional I/O.
> >
> > Personally I'd be very interested in hearing any stories about running
> > cassandra nodes with 10+ gig heap sizes, and how well it has worked.
> > My gut feeling is that it should work reasonably well, but I have no
> > evidence of that and I may very well be wrong. Anyone?
> >
> > (On a related note, my limited testing with the G1 collector with
> > Cassandra has indicated it works pretty well. Though I'm concerned
> > with the weak ref finalization based cleanup of compacted sstables
> > since the G1 collector will be much less deterministic in when a
> > particular object may be collected. Has anyone deployed Cassandra with
> > G1 on very large heaps under real load?)
> >

