Hi Adam,

We have been facing some similar issues of late. Wondering if Jonathan's
suggestions worked for you. Thanks!
On Sat, May 7, 2011 at 6:37 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> The live:serialized size ratio depends on what your data looks like
> (small columns will be less efficient than large blobs), but using the
> rule of thumb of 10x, around 1G * (1 + memtable_flush_writers +
> memtable_flush_queue_size).
>
> So the first thing I would do is drop writers and queue to 1 and 1.
>
> Then I would drop the max heap to 1G and the memtable size to 8MB so the
> heap dump is easier to analyze. Then let it OOM and look at the dump with
> http://www.eclipse.org/mat/
>
> On Sat, May 7, 2011 at 3:54 PM, Serediuk, Adam
> <adam.sered...@serialssolutions.com> wrote:
> > How much memory should a single hot CF with a 128MB memtable take with
> > row and key caching disabled during reads?
> >
> > Because I'm seeing heap go from 3.5GB skyrocketing straight to max
> > (regardless of the size; 8GB and 24GB both do the same), at which time
> > the JVM will do nothing but full GC and is unable to reclaim any
> > meaningful amount of memory. Cassandra then becomes unusable.
> >
> > I see the same behavior with smaller memtables, e.g. 64MB.
> >
> > This happens well into the read operation and only on a small number of
> > nodes in the cluster (1-4 out of a total of 60 nodes).
> >
> > Sent from my iPhone
> >
> > On May 6, 2011, at 22:45, "Jonathan Ellis" <jbel...@gmail.com> wrote:
> >
> >> You don't GC storm without legitimately having a too-full heap. It's
> >> normal to see occasional full GCs from fragmentation, but that will
> >> actually compact the heap and everything goes back to normal IF you
> >> had space actually freed up.
> >>
> >> You say you've played w/ memtable size, but that would still be my bet.
> >> Most people severely underestimate how much space this takes (10x in
> >> memory over serialized size), which will bite you when you have lots
> >> of CFs defined.
> >>
> >> Otherwise, force a heap dump after a full GC and take a look to see
> >> what's referencing all the memory.
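For readers following along: Jonathan's suggested debug settings above can be
sketched as a cassandra.yaml fragment. This is only an illustration for the
0.7 line; in 0.7 the per-CF memtable threshold itself is a column family
attribute (adjusted via the CLI, e.g. memtable_throughput), and the max heap
is set in conf/cassandra-env.sh (-Xmx1G), not here.

```yaml
# cassandra.yaml (Cassandra 0.7.x) -- debugging sketch, not production values.
# Limiting concurrent and queued flushes bounds how many full memtables can
# sit on the heap at once, per the 1G * (1 + writers + queue) rule of thumb.
memtable_flush_writers: 1
memtable_flush_queue_size: 1
```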
> >>
> >> On Fri, May 6, 2011 at 12:25 PM, Serediuk, Adam
> >> <adam.sered...@serialssolutions.com> wrote:
> >>> We're troubleshooting a memory usage problem during batch reads. We've
> >>> spent the last few days profiling and trying different GC settings.
> >>> The symptoms are that after a certain amount of time during reads, one
> >>> or more nodes in the cluster will exhibit extreme memory pressure
> >>> followed by a GC storm. We've tried every possible JVM setting and
> >>> different GC methods, and the issue persists. This points towards
> >>> something instantiating a lot of objects and keeping references so
> >>> that they can't be cleaned up.
> >>>
> >>> Typically nothing is ever logged other than the GC failures; however,
> >>> just now one of the nodes emitted logs we've never seen before:
> >>>
> >>> INFO [ScheduledTasks:1] 2011-05-06 15:04:55,085 StorageService.java
> >>> (line 2218) Unable to reduce heap usage since there are no dirty
> >>> column families
> >>>
> >>> We have tried increasing the heap on these nodes to large values, e.g.
> >>> 24GB, and still run into the same issue. We're running 8GB of heap
> >>> normally, and only one or two nodes will ever exhibit this issue,
> >>> randomly. We don't use key/row caching, and our memtable sizing is
> >>> 64MB/0.3. Larger or smaller memtables make no difference in avoiding
> >>> the issue. We're on 0.7.5, mmap, JNA, and JDK 1.6.0_24.
> >>>
> >>> We've somewhat hit the wall in troubleshooting and any advice is
> >>> greatly appreciated.
> >>>
> >>> --
> >>> Adam
> >>>
> >>
> >>
> >> --
> >> Jonathan Ellis
> >> Project Chair, Apache Cassandra
> >> co-founder of DataStax, the source for professional Cassandra support
> >> http://www.datastax.com
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
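For anyone wanting to estimate this on their own cluster: the heuristic
quoted upthread (live size is roughly 10x serialized size, and in the worst
case the heap holds the active memtable plus one per flush writer plus the
flush queue) can be turned into a quick back-of-the-envelope calculation.
This is only a sketch of the rule of thumb from this thread, not an official
formula, and the "default" writer/queue values in the comment are
hypothetical illustrations:

```python
def memtable_heap_estimate_mb(memtable_throughput_mb,
                              flush_writers,
                              flush_queue_size,
                              live_to_serialized_ratio=10):
    """Rough worst-case heap (in MB) held by memtables for one hot CF:
    the active memtable plus everything being flushed or queued, each
    costing ~10x its serialized size once expanded into live objects."""
    per_memtable_live_mb = memtable_throughput_mb * live_to_serialized_ratio
    return per_memtable_live_mb * (1 + flush_writers + flush_queue_size)

# A 128MB memtable with illustrative settings of 1 writer and a queue of 4:
print(memtable_heap_estimate_mb(128, 1, 4))  # 7680 MB for a single CF

# Jonathan's suggested debug settings: 8MB memtable, writers=1, queue=1:
print(memtable_heap_estimate_mb(8, 1, 1))    # 240 MB
```

This also shows why Jonathan's 1G figure lines up with a 128MB memtable:
128MB * 10 is roughly 1.28GB per in-flight memtable, multiplied by however
many can exist at once.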