My 5 cents: I'd check blockdev --getra on the data drives. A readahead
value that is too high (the default is 256 on Debian) can hurt read performance.
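To put that number in context: blockdev --getra reports readahead in 512-byte sectors, so the conversion to KiB can be sketched as below. The helper name readahead_kib and the smaller comparison value of 16 sectors are made up for illustration, not recommendations.

```python
SECTOR_BYTES = 512  # blockdev --getra reports readahead in 512-byte sectors


def readahead_kib(sectors: int) -> float:
    """Convert a readahead value given in sectors to KiB."""
    return sectors * SECTOR_BYTES / 1024


if __name__ == "__main__":
    # 256 is the default mentioned above; 16 is a hypothetical lower setting.
    for sectors in (256, 16):
        print(f"{sectors} sectors = {readahead_kib(sectors):.0f} KiB of readahead")
```

With 256 sectors, a single random read can pull in up to 128 KiB, which may be far more than one small row needs.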
On 05/16/2013 05:14 PM, Keith Wright wrote:
Hi all,
I currently have 2 clusters, one running on 1.1.10 using CQL2 and
one running on 1.2.4 using CQL3 and vnodes. The machines in the
1.2.4 cluster are expected to have better IO performance as we are
going from 1 SSD data disk per node in the 1.1 cluster to 3 SSD data
disks per node in the 1.2 cluster with higher-end drives (commit logs
are on their own disk shared with the OS). I am doing some stress
testing on the 1.2 cluster and have found that although the reads /
sec as seen from iostat are approximately the same (3K / sec) in both
clusters, the MB/s read in the new cluster is MUCH higher (7 MB/s in
1.1 as compared to 30-50 MB/s in 1.2). As a result, I am seeing
excessive iowait in the 1.2 cluster causing high average read times of
30 ms under the same load (1.1 cluster sees around 5 ms). They are
both using Leveled compaction but one thing I did change in the new
cluster was to increase the sstable size from the OOTB setting to 32
MB. Note that my reads are by definition highly random as we are
running memcached in front for various reasons. Does Cassandra need
to read the entire SSTable when fetching a row, or only the relevant
chunk (I have the OOTB chunk size and BF settings)? I just decreased
the sstable size to 5 MB and am waiting for compactions to complete to
see if that makes a difference.
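For reference, the iostat figures above imply an average amount of data per read, which is simply throughput divided by reads per second. A rough Python sketch, assuming "3K / sec" means about 3,000 reads/s and 1 MB = 1024 KB (the function name avg_read_kb is only illustrative):

```python
def avg_read_kb(mb_per_sec: float, reads_per_sec: float) -> float:
    """Average KB transferred per read, assuming 1 MB = 1024 KB."""
    return mb_per_sec * 1024 / reads_per_sec


READS_PER_SEC = 3000  # "3K / sec" from iostat, treated as exact for this estimate

if __name__ == "__main__":
    print(f"1.1 cluster: {avg_read_kb(7, READS_PER_SEC):.1f} KB/read")
    print(f"1.2 cluster: {avg_read_kb(30, READS_PER_SEC):.1f} to "
          f"{avg_read_kb(50, READS_PER_SEC):.1f} KB/read")
```

Under those assumptions each read in the 1.2 cluster moves roughly four to seven times as much data as in the 1.1 cluster, at the same number of reads per second.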
Thanks!
Relevant table definition if helpful (note that I also changed to the
LZ4 compressor expecting better read performance and I decreased the
crc check chance, again to minimize read latency):
CREATE TABLE global_user (
    user_id BIGINT,
    app_id INT,
    type TEXT,
    name TEXT,
    last TIMESTAMP,
    paid BOOLEAN,
    values map<TIMESTAMP,FLOAT>,
    sku_time map<TEXT,TIMESTAMP>,
    extra_param map<TEXT,TEXT>,
    PRIMARY KEY (user_id, app_id, type, name)
) WITH compression = {'crc_check_chance':0.1, 'sstable_compression':'LZ4Compressor'}
  AND compaction = {'class':'LeveledCompactionStrategy'}
  AND compaction_strategy_options = {'sstable_size_in_mb':5}
  AND gc_grace_seconds = 86400;