Yes, using a larger heap than you need can be bad from a GC latency point of view.
Upgrading to 0.4.2 will also help, since we have better default GC options.

On Mon, Nov 16, 2009 at 12:13 PM, Chris Were <chris.w...@gmail.com> wrote:
> Hi Tim,
>
> Thanks for the great pointers. si and so are regularly in the 100-2000
> range. I'll need to Google more about what these mean, but are you
> effectively saying to tell Cassandra to use less memory? Cassandra is the
> only Java app running on the server.
>
> Cheers,
> Chris
>
> On Mon, Nov 16, 2009 at 9:59 AM, Freeman, Tim <tim.free...@hp.com> wrote:
>>
>> I'm running 0.4.1. I used to get timeouts; then I changed my timeout from
>> 5 seconds to 30 seconds and I get no more timeouts. The relevant line
>> from storage-conf.xml is:
>>
>> <RpcTimeoutInMillis>30000</RpcTimeoutInMillis>
>>
>> The maximum latency is often just over 5 seconds in the worst case when I
>> fetch thousands of records, so the default timeout of 5 seconds happens
>> to be a little bit too low for me. My records are ~100 Kbytes each. You
>> may get different results if your records are much larger or much
>> smaller.
>>
>> The other issue I was having a few days ago was that the machine was page
>> faulting, so garbage collections were taking forever. Some GCs took 20
>> minutes in another Java process. I didn't have -verbose:gc turned on in
>> Cassandra, so I'm not sure what the score was there, but there's little
>> reason to expect it to be qualitatively better, since it's pretty random
>> which process gets some of its pages swapped out. On a Linux machine, run
>> "vmstat 5" when your machine is loaded, and if you see numbers greater
>> than 0 in the "si" and "so" columns in rows after the first, tell one of
>> your Java processes to take less memory.
>>
>> Tim Freeman
>> Email: tim.free...@hp.com
>> Desk in Palo Alto: (650) 857-2581
>> Home: (408) 774-1298
>> Cell: (408) 348-7536 (No reception business hours Monday, Tuesday, and
>> Thursday; call my desk instead.)
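[Editor's note: Tim's `vmstat 5` check can be automated. A minimal sketch follows; the sample output and the parsing approach are illustrative, not part of the original thread.]

```python
# Check the "si" (swap-in) and "so" (swap-out) columns of `vmstat 5` output.
# Nonzero values in rows after the first indicate active swapping, which
# makes JVM garbage-collection pauses drastically worse.

def swapping_detected(vmstat_output: str) -> bool:
    lines = [ln for ln in vmstat_output.splitlines() if ln.strip()]
    header = lines[1].split()          # second line holds the column names
    si, so = header.index("si"), header.index("so")
    # Skip the first data row: it reports averages since boot, not current activity.
    for row in lines[3:]:
        fields = row.split()
        if int(fields[si]) > 0 or int(fields[so]) > 0:
            return True
    return False

# Sample text resembling `vmstat 5` on a loaded machine (made up for illustration):
sample = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0  10240  81920  20480 409600    0    0    12     9   30   40  5  2 92  1
 2  1  10240  40960  20480 409600  150  600    80   120  500  900 20 10 60 10
"""
print(swapping_detected(sample))  # True: si=150, so=600 in the second data row
```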
>>
>> From: Chris Were [mailto:chris.w...@gmail.com]
>> Sent: Monday, November 16, 2009 9:47 AM
>> To: Jonathan Ellis
>> Cc: cassandra-user@incubator.apache.org
>> Subject: Re: Timeout Exception
>>
>> I turned on debug logging for a few days, and timeouts happened across
>> pretty much all requests. I couldn't see any particular request that was
>> consistently the problem.
>>
>> After some experimenting, it seems that shutting down Cassandra and
>> restarting resolves the problem. Once it hits the JVM memory limit,
>> however, the timeouts start again. I have read the page on memtable
>> thresholds and have tried thresholds of 32MB, 64MB and 128MB with no
>> noticeable difference. Cassandra is set to use 7GB of memory. I have 12
>> CFs; however, only 6 of those have lots of data.
>>
>> Cheers,
>> Chris
>>
>> On Tue, Nov 10, 2009 at 11:55 AM, Jonathan Ellis <jbel...@gmail.com>
>> wrote:
>>
>> if you're timing out doing a slice on 10 columns w/ 10% CPU used,
>> something is broken
>>
>> is it consistent as to which keys this happens on? try turning on
>> debug logging and seeing where the latency is coming from.
>>
>> On Tue, Nov 10, 2009 at 1:53 PM, Chris Were <chris.w...@gmail.com> wrote:
>> >
>> > On Tue, Nov 10, 2009 at 11:50 AM, Jonathan Ellis <jbel...@gmail.com>
>> > wrote:
>> >>
>> >> On Tue, Nov 10, 2009 at 1:49 PM, Chris Were <chris.w...@gmail.com>
>> >> wrote:
>> >> > Maybe... but it's not just multigets, it also happens when retrieving
>> >> > one row with get_slice.
>> >>
>> >> how many of the 3M columns are you trying to slice at once?
>> >
>> > Sorry, I must have mixed up the terminology. There's ~3M keys, but fewer
>> > than 10 columns in each. The get_slice calls are to retrieve all the
>> > columns (10) for a given key.
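[Editor's note: Chris's figures (7 GB heap, 12 CFs, memtable thresholds of 32-128 MB) can be sanity-checked with back-of-the-envelope arithmetic. The overhead multiplier below is a hypothetical assumption for illustration (in-memory memtables cost more than their serialized threshold due to object headers, indexes, and so on), not a measured Cassandra constant.]

```python
# Rough worst-case memtable footprint vs. heap, using the figures from the
# thread. OVERHEAD is an assumed multiplier for the gap between a memtable's
# serialized threshold and its actual in-memory size -- not a measured value.

HEAP_GB = 7
NUM_CFS = 12
OVERHEAD = 4  # hypothetical; the real ratio depends on column sizes

for threshold_mb in (32, 64, 128):
    worst_case_gb = NUM_CFS * threshold_mb * OVERHEAD / 1024
    print(f"{threshold_mb:>3} MB threshold -> up to {worst_case_gb:.1f} GB "
          f"of the {HEAP_GB} GB heap held by memtables")
```

Under these (assumed) numbers, the 128 MB setting could let memtables occupy most of the 7 GB heap in the worst case, which would be consistent with the GC pressure and timeouts described above once the JVM memory limit is reached.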