I would expect read latency to increase linearly with the number of
SSTables you have around.  How many are in your data directories?  Is
your compaction lagging thousands of tables behind again?
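
For example, a rough count of live SSTables per column family (assuming the
default data directory and the Keyspace1 name from the sample
storage-conf.xml; adjust the path to match your DataFileDirectories):

  ls /var/lib/cassandra/data/Keyspace1/*-Data.db | wc -l

Each memtable flush adds another SSTable, and until compaction merges them a
read may have to consult many of them.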

On Thu, Dec 3, 2009 at 12:58 PM, Freeman, Tim <tim.free...@hp.com> wrote:
> I ran another test last night with the build dated 29 Nov 2009.  Other than 
> the Cassandra version, the setup was the same as before.  The results were 
> qualitatively similar as well: the read latency increased fairly smoothly 
> from 250ms to 1s; the GC times reported by jconsole are low; the pending 
> tasks for row-mutation-stage and row-read-stage are less than 10; the pending 
> tasks for the compaction pool are 1615.  Last time around the read latency 
> maxed out at one second.  This time it reached one second just as I'm writing 
> this, so I don't know yet whether it will continue to increase.
>
> I have attached a fresh graph describing the present run.  It's qualitatively 
> similar to the previous one.  The vertical units are milliseconds (for 
> latency) and operations per minute (for reads or writes).  The horizontal 
> scale is seconds.  The feature that's bothering me is the red read-latency 
> line climbing steadily from the lower left toward the middle right.  The 
> scale doesn't make it look dramatic, but Cassandra slowed down by a factor of 
> 4.
>
> The read and write rates were stable for 45,000 seconds or so, and then the 
> read latency got big enough that the application was starved for reads and it 
> started writing less.
>
> If this is worth pursuing, I suppose the next step would be for me to make a 
> small program that reproduces the problem.  It should be easy -- we're just 
> reading and writing random records (see the sketch below).  Let me know if 
> there's interest in that.  I could also decide to live with a 1000 ms latency 
> here.  I'm thinking of putting a cache in the local filesystem in front of 
> Cassandra (or whichever distributed DB we decide to go with), so living with 
> it is definitely possible.
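>
> A minimal sketch of what that reproduction program might look like -- here 
> KVStore is just a hypothetical stand-in for whatever client wrapper we end 
> up using, not a real API:
>
>   import java.util.Random;
>
>   public class RandomLoad {
>       interface KVStore {                       // hypothetical wrapper around the Cassandra client
>           void store(String key, byte[] value) throws Exception;
>           byte[] fetch(String key) throws Exception;
>       }
>
>       static final int RECORDS = 350000;        // distinct keys, as in the test
>       static final int VALUE_SIZE = 100 * 1024; // 100 kbyte values
>
>       public static void main(String[] args) {
>           final KVStore store = connect();      // wire up the real client here
>           for (int i = 0; i < 8; i++) {
>               new Thread(new Runnable() {       // one of 8 writer threads
>                   public void run() {
>                       Random r = new Random();
>                       byte[] value = new byte[VALUE_SIZE];
>                       try {
>                           while (true) {
>                               r.nextBytes(value);
>                               store.store("key" + r.nextInt(RECORDS), value);
>                           }
>                       } catch (Exception e) { e.printStackTrace(); }
>                   }
>               }).start();
>               new Thread(new Runnable() {       // one of 8 reader threads
>                   public void run() {
>                       Random r = new Random();
>                       try {
>                           while (true) {
>                               store.fetch("key" + r.nextInt(RECORDS));
>                           }
>                       } catch (Exception e) { e.printStackTrace(); }
>                   }
>               }).start();
>           }
>       }
>
>       static KVStore connect() {
>           throw new UnsupportedOperationException("plug in the real client here");
>       }
>   }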
>
> Tim Freeman
> Email: tim.free...@hp.com
> Desk in Palo Alto: (650) 857-2581
> Home: (408) 774-1298
> Cell: (408) 348-7536 (No reception business hours Monday, Tuesday, and 
> Thursday; call my desk instead.)
>
> -----Original Message-----
> From: Jonathan Ellis [mailto:jbel...@gmail.com]
> Sent: Tuesday, December 01, 2009 11:10 AM
> To: cassandra-user@incubator.apache.org
> Subject: Re: Persistently increasing read latency
>
> 1) Use jconsole to see what is happening to JVM / Cassandra internals.
> Possibly you are slowly exceeding Cassandra's ability to keep up with
> writes, causing the JVM to spend more and more effort GCing to find
> enough memory to keep going.
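>
> For example (assuming HotSpot's jstat is on your PATH; <pid> is the
> Cassandra process id):
>
>   jstat -gcutil <pid> 5000
>
> prints heap occupancy percentages and cumulative GC counts/times every 5
> seconds, which makes a slow GC death spiral easy to spot.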
>
> 2) You should be on at least 0.4.2, and preferably trunk, if you are
> stress testing.
>
> -Jonathan
>
> On Tue, Dec 1, 2009 at 12:11 PM, Freeman, Tim <tim.free...@hp.com> wrote:
>> In an 8 hour test run, I've seen the read latency for Cassandra drift fairly 
>> linearly from ~460ms to ~900ms.  Eventually my application gets starved for 
>> reads and starts misbehaving.  I have attached graphs -- horizontal scales 
>> are seconds, vertical scales are operations per minute and average 
>> milliseconds per operation.  The clearest feature is the light blue line in 
>> the left graph drifting consistently upward during the run.
>>
>> I have a Cassandra 0.4.1 database, one node, records of 100 kbytes each, 
>> 350K records, 8 threads reading, around 700 reads per minute.  There are 
>> also 8 threads writing.  This is all happening on a 4-core processor that's 
>> supporting both the Cassandra node and the code that's generating load for 
>> it.  I'm reasonably sure that there are no page faults.
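>>
>> (For scale: 350,000 records x 100 kbytes is roughly 35 GB of data on disk, 
>> well beyond the 1 GB heap configured below.)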
>>
>> I have attached my storage-conf.xml.  Briefly, it has default values, except 
>> RpcTimeoutInMillis is 30000 and the partitioner is 
>> OrderPreservingPartitioner.  Cassandra's garbage collection parameters are:
>>
>>   -Xms128m -Xmx1G -XX:SurvivorRatio=8 -XX:+AggressiveOpts -XX:+UseParNewGC 
>> -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
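>>
>> For reference, the non-default fragment of that storage-conf.xml looks 
>> roughly like this (element names as in the 0.4 sample config):
>>
>>   <Partitioner>org.apache.cassandra.dht.OrderPreservingPartitioner</Partitioner>
>>   <RpcTimeoutInMillis>30000</RpcTimeoutInMillis>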
>>
>> Is this normal behavior?  Is there some change to the configuration I should 
>> make to get it to stop getting slower?  If it's not normal, what debugging 
>> information should I gather?  Should I give up on Cassandra 0.4.1 and move 
>> to a newer version?
>>
>> I'll leave it running for the time being in case there's something useful to 
>> extract from it.
>>
>> Tim Freeman
>> Email: tim.free...@hp.com
>> Desk in Palo Alto: (650) 857-2581
>> Home: (408) 774-1298
>> Cell: (408) 348-7536 (No reception business hours Monday, Tuesday, and 
>> Thursday; call my desk instead.)
>>
>>
>
