[ https://issues.apache.org/jira/browse/CASSANDRA-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933188#action_12933188 ]
T Jake Luciani edited comment on CASSANDRA-1470 at 11/17/10 5:29 PM:
---------------------------------------------------------------------

Couple more fixes:

1) Tweaked directio buffer size to be 16k. Gets much better read performance.
2) Another fix for non-jna users

I've done some substantial testing of this patch and I have to say I see no effect.

I think the reason is commit log and SSTable writes are entering the page cache as well. I think the next approach would be to make these use DirectIO as well.

Back to you Pavel :)

My benchmarking looks like this:

{code}
#Write a lot of data to CF
python stress.py -S 8192 -n 100000 -k

#triggers minor compaction
#wait till complete....

#read from CF till file cache is primed
python stress.py -n 100000 -k -o read
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
1010,101,101,0.5461584162,11
2465,145,145,0.384013745219,22
[cropped]

#primed read
python stress.py -n 100000 -k -o read
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
26962,2696,2696,0.0219141756796,11
59567,3260,3260,0.0170967640897,22
97089,3752,3752,0.0146972443858,33
100000,291,290,0.00880796890495,34

#write to another column family
python stress.py -S 8192 -n 100000 -k -y super

#triggers minor compaction of second CF.. wait till complete

#read from first CF again (slowwww)
python stress.py -n 100000 -k -o read
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
1010,101,101,0.5461584162,11
2465,145,145,0.384013745219,22
3915,145,145,0.385855172749,33
5683,176,176,0.307302036167,44
7591,190,190,0.291640462241,55
9575,198,198,0.281985886275,66
[cropped]
{code}

was (Author: tjake):

Couple more fixes:

1) Tweaked directio buffer size to be 16k. Gets much better read performance.
2) Another fix for non-jna users

I've done some substantial testing of this patch and I have to say I see no effect.

I think the reason is commit log and SSTable writes are entering the page cache as well. I think the next approach would be to make these use DirectIO as well.

Back to you Pavel :)

My benchmarking looks like this:

{code}
#Write a lot of data to 2 column families
python stress.py -S 8192 -n 100000 -k
python stress.py -S 8192 -n 100000 -k -y super

#read it till file cache is primed
python stress.py -n 100000 -k -o read
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
1010,101,101,0.5461584162,11
2465,145,145,0.384013745219,22
[cropped]

#primed read
python stress.py -n 100000 -k -o read
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
26962,2696,2696,0.0219141756796,11
59567,3260,3260,0.0170967640897,22
97089,3752,3752,0.0146972443858,33
100000,291,290,0.00880796890495,34

#force compaction
nodetool -h localhost compact Keyspace1

#read (slowwww)
python stress.py -n 100000 -k -o read
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
1010,101,101,0.5461584162,11
2465,145,145,0.384013745219,22
3915,145,145,0.385855172749,33
5683,176,176,0.307302036167,44
7591,190,190,0.291640462241,55
9575,198,198,0.281985886275,66
[cropped]
{code}

> use direct io for compaction
> ----------------------------
>
>                 Key: CASSANDRA-1470
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1470
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Pavel Yaskevich
>             Fix For: 0.6.9
>
>         Attachments: 1470-v2.txt, 1470.txt, CASSANDRA-1470-for-0.6.patch,
> CASSANDRA-1470-v10-for-0.7.patch, CASSANDRA-1470-v11-for-0.7.patch,
> CASSANDRA-1470-v2.patch,
> CASSANDRA-1470-v3-0.7-with-LastErrorException-support.patch,
> CASSANDRA-1470-v4-for-0.7.patch, CASSANDRA-1470-v5-for-0.7.patch,
> CASSANDRA-1470-v6-for-0.7.patch, CASSANDRA-1470-v7-for-0.7.patch,
> CASSANDRA-1470-v8-for-0.7.patch, CASSANDRA-1470-v9-for-0.7.patch,
> CASSANDRA-1470.patch,
> use.DirectIORandomAccessFile.for.commitlog.against.1022235.patch
>
>
> When compaction scans through a group of sstables, it forces out the data in the
> OS buffer cache that is being used for hot reads, which can have a dramatic negative
> effect on performance.
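
To put a rough number on the slowdown in the runs above, the stress.py interval output can be reduced to one overall rate per run. This is only a sketch: overall_op_rate is a hypothetical helper (not part of stress.py), it assumes the five-column CSV layout shown above, and it assumes elapsed_time is in seconds. The sample rows are copied from the primed read and from the read after the second column family was compacted.

```python
import csv

# Column layout assumed from the header line printed by stress.py above.
FIELDS = ["total", "interval_op_rate", "interval_key_rate",
          "avg_latency", "elapsed_time"]

PRIMED = """\
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
26962,2696,2696,0.0219141756796,11
59567,3260,3260,0.0170967640897,22
97089,3752,3752,0.0146972443858,33
100000,291,290,0.00880796890495,34
"""

AFTER_COMPACTION = """\
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
1010,101,101,0.5461584162,11
2465,145,145,0.384013745219,22
3915,145,145,0.385855172749,33
5683,176,176,0.307302036167,44
7591,190,190,0.291640462241,55
9575,198,198,0.281985886275,66
[cropped]
"""


def overall_op_rate(output):
    """Return cumulative operations per second for one run.

    Uses the last numeric row only: total operations so far divided by
    elapsed_time (assumed to be seconds). The header, comments and
    "[cropped]" markers are skipped because they do not start with a digit.
    """
    lines = (ln.strip() for ln in output.splitlines())
    data = [ln for ln in lines if ln and ln[0].isdigit()]
    rows = list(csv.DictReader(data, fieldnames=FIELDS))
    if not rows:
        raise ValueError("no interval rows found")
    last = rows[-1]
    return int(last["total"]) / float(last["elapsed_time"])


if __name__ == "__main__":
    primed = overall_op_rate(PRIMED)
    slow = overall_op_rate(AFTER_COMPACTION)
    print("primed: %.0f ops/s, after compaction: %.0f ops/s, ratio: %.1fx"
          % (primed, slow, primed / slow))
```

Dividing the final cumulative total by elapsed time, rather than averaging interval_op_rate, keeps the short last interval of the primed run (291 ops in about one second) from skewing the result. The slow run was cropped, so its figure only covers the first 66 seconds.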