[jira] Issue Comment Edited: (CASSANDRA-1470) use direct io for compaction

T Jake Luciani (JIRA) Wed, 17 Nov 2010 14:31:42 -0800

    [ https://issues.apache.org/jira/browse/CASSANDRA-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933188#action_12933188 ]


T Jake Luciani edited comment on CASSANDRA-1470 at 11/17/10 5:29 PM:
---------------------------------------------------------------------

Couple more fixes:

1) Tweaked directio buffer size to be 16k. Gets much better read performance.
2) Another fix for non-jna users

I've done substantial testing of this patch and I have to say I see no 
effect.  I think the reason is that commit log and SSTable writes are entering 
the page cache as well.  I think the next step would be to make these use 
direct I/O too.  Back to you, Pavel :)

My benchmarking looks like this:

{code}
#Write a lot of data to the first CF
python stress.py -S 8192 -n 100000 -k

#triggers minor compaction
#wait till complete....

#read from CF till file cache is primed
python stress.py -n 100000 -k -o read
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
1010,101,101,0.5461584162,11
2465,145,145,0.384013745219,22
[cropped]

#primed read
python stress.py -n 100000 -k -o read
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
26962,2696,2696,0.0219141756796,11
59567,3260,3260,0.0170967640897,22
97089,3752,3752,0.0146972443858,33
100000,291,290,0.00880796890495,34


#write to another column family
python stress.py -S 8192 -n 100000 -k -y super

#triggers minor compaction of second CF.. wait till complete

#read from first CF again (slowwww)
python stress.py -n 100000 -k -o read
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
1010,101,101,0.5461584162,11
2465,145,145,0.384013745219,22
3915,145,145,0.385855172749,33
5683,176,176,0.307302036167,44
7591,190,190,0.291640462241,55
9575,198,198,0.281985886275,66
[cropped]
{code}
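
To put a number on the slowdown between the primed read and the read after the second CF's compaction, the interval lines above can be reduced to a single figure per run. This is only a rough sketch: the helper names (parse_intervals, weighted_latency) are made up for illustration, and it assumes that avg_latency is the mean latency of the operations completed in that interval, in the unit stress.py reports. The sample rows are copied from the output above; the second run is cropped, so its figure covers only the first intervals.

```python
def parse_intervals(text):
    """Extract (total, avg_latency) pairs from stress.py interval output.

    Header lines, blank lines and markers such as "[cropped]" are skipped.
    """
    rows = []
    for line in text.splitlines():
        parts = line.strip().split(",")
        if len(parts) != 5:
            continue
        try:
            total = int(parts[0])
            latency = float(parts[3])
        except ValueError:
            continue  # header row, not data
        rows.append((total, latency))
    return rows


def weighted_latency(rows):
    """Mean latency over a run, weighting each interval by its op count.

    Assumption: avg_latency describes only the ops finished in that interval,
    so the op count per interval is the increase in the 'total' column.
    """
    previous_total = 0
    ops = 0
    accumulated = 0.0
    for total, latency in rows:
        interval_ops = total - previous_total
        accumulated += interval_ops * latency
        ops += interval_ops
        previous_total = total
    return accumulated / ops


PRIMED = """total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
26962,2696,2696,0.0219141756796,11
59567,3260,3260,0.0170967640897,22
97089,3752,3752,0.0146972443858,33
100000,291,290,0.00880796890495,34
"""

AFTER_COMPACTION = """total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
1010,101,101,0.5461584162,11
2465,145,145,0.384013745219,22
3915,145,145,0.385855172749,33
5683,176,176,0.307302036167,44
7591,190,190,0.291640462241,55
9575,198,198,0.281985886275,66
[cropped]
"""

if __name__ == "__main__":
    primed = weighted_latency(parse_intervals(PRIMED))
    slow = weighted_latency(parse_intervals(AFTER_COMPACTION))
    print("primed read:      %.4f" % primed)
    print("after compaction: %.4f" % slow)
    print("slowdown factor:  %.1fx" % (slow / primed))
```

Weighting by operations per interval keeps the short final interval of the primed run from skewing the mean; a plain average of the avg_latency column would also do for a quick comparison.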

      was (Author: tjake):
    Couple more fixes:

1) Tweaked directio buffer size to be 16k. Gets much better read performance.
2) Another fix for non-jna users

I've done substantial testing of this patch and I have to say I see no 
effect.  I think the reason is that commit log and SSTable writes are entering 
the page cache as well.  I think the next step would be to make these use 
direct I/O too.  Back to you, Pavel :)

My benchmarking looks like this:

{code}
#Write a lot of data to 2 column families
python stress.py -S 8192 -n 100000 -k
python stress.py -S 8192 -n 100000 -k -y super

#read it till file cache is primed
python stress.py -n 100000 -k -o read
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
1010,101,101,0.5461584162,11
2465,145,145,0.384013745219,22
[cropped]

#primed read
python stress.py -n 100000 -k -o read
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
26962,2696,2696,0.0219141756796,11
59567,3260,3260,0.0170967640897,22
97089,3752,3752,0.0146972443858,33
100000,291,290,0.00880796890495,34

#force compaction
nodetool -h localhost compact Keyspace1

#read (slowwww)
python stress.py -n 100000 -k -o read
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
1010,101,101,0.5461584162,11
2465,145,145,0.384013745219,22
3915,145,145,0.385855172749,33
5683,176,176,0.307302036167,44
7591,190,190,0.291640462241,55
9575,198,198,0.281985886275,66
[cropped]
{code}
  
> use direct io for compaction
> ----------------------------
>
>                 Key: CASSANDRA-1470
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1470
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Pavel Yaskevich
>             Fix For: 0.6.9
>
>         Attachments: 1470-v2.txt, 1470.txt, CASSANDRA-1470-for-0.6.patch, 
> CASSANDRA-1470-v10-for-0.7.patch, CASSANDRA-1470-v11-for-0.7.patch, 
> CASSANDRA-1470-v2.patch, 
> CASSANDRA-1470-v3-0.7-with-LastErrorException-support.patch, 
> CASSANDRA-1470-v4-for-0.7.patch, CASSANDRA-1470-v5-for-0.7.patch, 
> CASSANDRA-1470-v6-for-0.7.patch, CASSANDRA-1470-v7-for-0.7.patch, 
> CASSANDRA-1470-v8-for-0.7.patch, CASSANDRA-1470-v9-for-0.7.patch, 
> CASSANDRA-1470.patch, 
> use.DirectIORandomAccessFile.for.commitlog.against.1022235.patch
>
>
> When compaction scans through a group of sstables, it forces out the data in 
> the OS buffer cache that is being used for hot reads, which can have a 
> dramatic negative effect on performance.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
