I dumped 50 million records into my 2-node cluster overnight, making sure
there weren't many data files (only about 30) per Martin's suggestion. The
data directory is 63 GB. Now when I read records from the cluster, the read
latency is still ~44 ms, with no writes happening during the reads. And
iostat shows that the disk (RAID10, 4x 250GB 15k SAS) is saturated:

Device:  rrqm/s  wrqm/s     r/s    w/s    rsec/s   wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda       47.67   67.67  190.33  17.00  23933.33   677.33    118.70      5.24  25.25   4.64  96.17
sda1       0.00    0.00    0.00   0.00      0.00     0.00      0.00      0.00   0.00   0.00   0.00
sda2      47.67   67.67  190.33  17.00  23933.33   677.33    118.70      5.24  25.25   4.64  96.17
sda3       0.00    0.00    0.00   0.00      0.00     0.00      0.00      0.00   0.00   0.00   0.00
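For context, here is a quick back-of-the-envelope conversion of those iostat numbers into more intuitive units (my own sketch, using the standard 512-byte iostat sector size):

```python
# Convert the sda2 iostat read figures into MB/s and KB-per-request.
SECTOR_BYTES = 512          # iostat reports sectors of 512 bytes

rsec_per_s = 23933.33       # read sectors/s from the iostat output above
avgrq_sectors = 118.70      # average request size, in sectors

read_mb_per_s = rsec_per_s * SECTOR_BYTES / 1e6
avg_request_kb = avgrq_sectors * SECTOR_BYTES / 1024

print(f"~{read_mb_per_s:.1f} MB/s read in ~{avg_request_kb:.0f} KB requests")
```

Roughly 12 MB/s in ~59 KB requests is nowhere near the sequential throughput of that array, which suggests the reads are largely random seeks; together with the 96% utilisation, that points at the spindles as the bottleneck.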

CPU usage is low.

Does this mean disk I/O is the bottleneck in my case? Would it help to
increase KCF so that all sstable indexes are cached?
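Assuming KCF here means the KeysCachedFraction setting in storage-conf.xml (my interpretation, not spelled out in the thread), caching all keys would look something like:

```xml
<!-- storage-conf.xml: fraction of row keys per sstable whose index
     positions are kept in memory; 1.0 caches them all
     (assumes KCF = KeysCachedFraction) -->
<KeysCachedFraction>1.0</KeysCachedFraction>
```

Note this keeps index positions in memory, so it trades heap for fewer index seeks per read; it does not avoid the data-file reads themselves.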

Also, this is almost a read-only test; in reality our write/read ratio is
close to 1:1, so I'm guessing read latency will go even higher in that case,
because it will be difficult for Cassandra to find a good moment to compact
data files that are busy being written.

Thanks,
-Weijun


On Tue, Feb 16, 2010 at 6:06 AM, Brandon Williams <dri...@gmail.com> wrote:

> On Tue, Feb 16, 2010 at 2:32 AM, Dr. Martin Grabmüller <
> martin.grabmuel...@eleven.de> wrote:
>
>> In my tests I have observed that good read latency depends on keeping
>> the number of data files low.  In my current test setup, I have stored
>> 1.9 TB of data on a single node, which is in 21 data files, and read
>> latency is between 10 and 60ms (for small reads; larger reads of course
>> take more time).  In earlier stages of my test, I had up to 5000
>> data files, and read performance was quite bad: my configured 10-second
>> RPC timeout was regularly encountered.
>>
>
> I believe it is known that crossing sstables is O(NlogN) but I'm unable to
> find the ticket on this at the moment.  Perhaps Stu Hood will jump in and
> enlighten me, but in any case I believe
> https://issues.apache.org/jira/browse/CASSANDRA-674 will eventually solve
> it.
>
> Keeping write volume low enough that compaction can keep up is one
> solution, and throwing hardware at the problem is another, if necessary.
>  Also, the row caching in trunk (soon to be 0.6 we hope) helps greatly for
> repeat hits.
>
> -Brandon
>
