> However, for both writes and reads there was virtually no difference in the 
> latencies.
What sort of latencies were you getting?

> I’m still not very sure where the current *write* bottleneck is though. 
What numbers are you getting?
Could the bottleneck be the client? Can it send writes fast enough to 
saturate the nodes?

As a rule of thumb you should get 3,000 to 4,000 (non-counter) writes per 
second per core. 
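So with 32 cores you would hope for very roughly 32 * 3,000 ≈ 96,000 writes/sec 
per node, which a single stress client with default settings often cannot drive 
on its own. A sketch with the 1.1-era cassandra-stress (hosts and thread count 
are illustrative; check the flags against your version, -t is the client thread 
count):

    # raise -t until throughput stops climbing; add a second stress box if needed
    cassandra-stress -d node1,node2 -n 30000000 -t 300 -o insert

If throughput keeps rising as you add client threads, the bottleneck was the 
client.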

> Sample iostat data (captured every 10s) for the dedicated disk where the 
> commit logs are written is below. Does this seem like a bottleneck?
Does not look too bad. 

> Another interesting thing is that the Linux disk cache doesn’t seem to be 
> growing in spite of a lot of free memory available. 
Things will only get paged in when they are accessed. 
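You can watch that happen: the ‘cached’ figure from free only grows as files 
are actually read. For example (sstable path illustrative):

    free -m      # note the 'cached' column
    cat /var/lib/cassandra/data/Keyspace1/*-Data.db > /dev/null
    free -m      # 'cached' should now be larger by roughly the bytes read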

Cheers


-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 21/11/2013, at 12:42 pm, Arindam Barua <aba...@247-inc.com> wrote:

>  
> Thanks for the suggestions Aaron.
>  
> As a follow up, we ran a bunch of tests with different combinations of these 
> changes on a 2-node ring. The load was generated using cassandra-stress, run 
> with default values to write 30 million rows, and read them back.
> However, for both writes and reads there was virtually no difference in the 
> latencies.
>  
> The different combinations attempted:
> 1. Baseline test with none of the below changes.
> 2. Grabbing the TLAB setting from 1.2.
> 3. Moving the commit logs too to the 7-disk RAID 0.
> 4. Increasing concurrent_reads to 32 and concurrent_writes to 64 (the exact 
> cassandra.yaml lines are shown below).
> 5. (3) + (4), i.e. moving the commit logs to the RAID and increasing 
> concurrent_reads/concurrent_writes to 32/64.
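> For reference, (4) was just these two cassandra.yaml lines (defaults 
> everywhere else):
> 
>     concurrent_reads: 32
>     concurrent_writes: 64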
>  
> The write latencies were very similar, except that they were ~3x worse at the 
> 99.9th percentile and above for scenario (5).
> The read latencies were also similar, with (3) and (5) being a little worse 
> at the 99.99th percentile.
>  
> Overall, making no changes, i.e. (1), performed as well as or slightly 
> better than any of the other combinations.
>  
> Running cassandra-stress on both the old and new hardware without making any 
> config changes, the write performance was very similar, but the new hardware 
> did show a ~10x improvement in reads at the 99.9th percentile and higher. 
> Thinking about this, the reason we were not seeing any difference with our 
> test framework is probably the nature of the test: we write the rows and then 
> immediately do a bunch of reads for the rows that were just written. The data 
> is read back from the memtables and never from the disk/sstables, so the new 
> hardware’s larger RAM (and hence disk cache) and higher disk count never help.
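> A sketch of how we could force reads off disk in a future run (assuming the 
> stress defaults of keyspace Keyspace1; flags per the 1.1-era stress tool):
> 
>     cassandra-stress -d node1,node2 -n 30000000 -o insert
>     nodetool -h node1 flush Keyspace1    # push memtables out to sstables
>     echo 3 > /proc/sys/vm/drop_caches    # as root: drop the page cache too
>     cassandra-stress -d node1,node2 -n 30000000 -o read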
>  
> I’m still not very sure where the current *write* bottleneck is though. The 
> new hardware has 32 cores vs the old hardware’s 8. Moving the commit log from 
> a dedicated disk to the 7-disk RAID 0 (where it is shared with other data, 
> though) didn’t make a difference either, unless the extra contention on the 
> RAID nullified its positive effects.
>  
> Sample iostat data (captured every 10s) for the dedicated disk where the 
> commit logs are written is below. Does this seem like a bottleneck? When the 
> commit logs are written, the await/svctm ratio is high.
>  
> Device:   rrqm/s   wrqm/s   r/s    w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
>             0.00     8.09  0.04   8.85    0.00    0.07     15.74      0.00   0.12   0.03   0.02
>             0.00   768.03  0.00   9.49    0.00    3.04    655.41      0.04   4.52   0.33   0.31
>             0.00     8.10  0.04   8.85    0.00    0.07     15.75      0.00   0.12   0.03   0.02
>             0.00   752.65  0.00  10.09    0.00    2.98    604.75      0.03   3.00   0.26   0.26
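> (Captured with something like ‘iostat -xm 10’ on the commit log device. 
> Reading it back: await minus svctm approximates queueing time, so in the busy 
> samples roughly 4.2 ms of the 4.5 ms await is time spent waiting in the queue.)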
>  
> Another interesting thing is that the Linux disk cache doesn’t seem to be 
> growing in spite of a lot of free memory being available. The total disk 
> cache reported by ‘free’ is less than the size of the sstables written, with 
> over 100 GB of RAM unused.
> Even in production, where the older hardware has been running with 32 GB RAM 
> for a long time, looking at 5 hosts in one DC only 2.5 GB to 8 GB is used for 
> the disk cache. The Cassandra java process uses the 8 GB allocated to it, and 
> at least 10-15 GB on every host is not used at all.
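> To see exactly how much of the data files is resident in the page cache we 
> could use vmtouch (a third-party tool; path illustrative):
> 
>     vmtouch /var/lib/cassandra/data/*/*-Data.db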
>  
> Thanks,
> Arindam
>  
> From: Aaron Morton [mailto:aa...@thelastpickle.com] 
> Sent: Wednesday, November 06, 2013 8:34 PM
> To: Cassandra User
> Subject: Re: Config changes to leverage new hardware
>  
> Running Cassandra 1.1.5 currently, but evaluating to upgrade to 1.2.11 soon.
> You will make more use of the extra memory moving to 1.2 as it moves bloom 
> filters and compression data off heap. 
>  
> Also grab the TLAB setting from cassandra-env.sh in v1.2
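> If memory serves it is a single JVM flag, something like this line from the 
> v1.2 cassandra-env.sh:
> 
>     JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"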
>  
> As of now, our performance tests (our application specific as well as 
> cassandra-stress) are not showing any significant difference in the 
> hardwares, which is a little disheartening, since the new hardware has a lot 
> more RAM and CPU.
> For reads or writes or both? 
>  
> Writes tend to scale with cores as long as the commit log can keep up. 
> Reads improve with disk IO and page cache size when the hot set is in memory. 
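> On the commit log point: one way to check whether sync is the limiting factor 
> is to compare the two sync modes in cassandra.yaml (these are the 1.1 
> defaults; batch mode trades throughput for durability):
> 
>     commitlog_sync: periodic
>     commitlog_sync_period_in_ms: 10000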
>  
> Old Hardware: 8 cores (2 quad-core), 32 GB RAM, four 1-TB disks (1 disk used 
> for the commitlog and 3 disks in RAID 0 for data)
> New Hardware: 32 cores (2 8-core with hyperthreading), 128 GB RAM, eight 1-TB 
> disks (1 disk used for the commitlog and 7 disks in RAID 0 for data)
> Is the disk IO on the commit log volume keeping up?
> You cranked up the concurrent writers and the commit log may not keep up. You 
> could put the commit log on the same RAID volume to see if that improves 
> writes. 
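> That is a one-line cassandra.yaml change (path illustrative):
> 
>     commitlog_directory: /var/lib/cassandra/raid0/commitlog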
>  
> The config we tried modifying so far was concurrent_reads to (16 * number of 
> drives) and concurrent_writes to (8 * number of cores) as per the 
> recommendation in cassandra.yaml.
> 256 write threads is a lot. Make sure the commit log can keep up; I would put 
> it back to 32, maybe try 64. I am not sure the concurrent list for the commit 
> log will work well with that many threads. 
>  
> You may want to turn the reads down as well. 
>  
> It’s easier to tune the system if you can provide some info on the workload. 
>  
> Cheers
>  
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
>  
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>  
> On 7/11/2013, at 12:35 pm, Arindam Barua <aba...@247-inc.com> wrote:
> 
> We want to upgrade our Cassandra cluster to have newer hardware, and were 
> wondering if anyone has suggestions on Cassandra or linux config changes that 
> will prove to be beneficial.
> As of now, our performance tests (our application specific as well as 
> cassandra-stress) are not showing any significant difference in the 
> hardwares, which is a little disheartening, since the new hardware has a lot 
> more RAM and CPU.
>  
> Old Hardware: 8 cores (2 quad-core), 32 GB RAM, four 1-TB disks (1 disk used 
> for the commitlog and 3 disks in RAID 0 for data)
> New Hardware: 32 cores (2 8-core with hyperthreading), 128 GB RAM, eight 1-TB 
> disks (1 disk used for the commitlog and 7 disks in RAID 0 for data)
>  
> Most of the Cassandra config is currently the default, and we are using 
> LeveledCompactionStrategy. Default key cache; row cache turned off.
> The config we tried modifying so far was concurrent_reads to (16 * number of 
> drives) and concurrent_writes to (8 * number of cores) as per the 
> recommendation in cassandra.yaml, but that didn’t make much difference.
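> On the new hardware that formula works out to:
> 
>     concurrent_reads:  16 * 7 drives = 112
>     concurrent_writes:  8 * 32 cores = 256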
> We were hoping that at least the extra RAM in the new hardware will be used 
> for Linux file caching and hence an improvement in performance will be 
> observed.
>  
> Running Cassandra 1.1.5 currently, but evaluating to upgrade to 1.2.11 soon.
>  
> Thanks,
> Arindam
