[jira] [Issue Comment Edited] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment
    [ https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176071#comment-13176071 ]

Vijay edited comment on CASSANDRA-3623 at 12/27/11 3:23 AM:
-----------------------------------------------------------

Alright, I think I found the missing pieces:

1) Please reapply v2 from CASSANDRA-3611 (which also depends on CASSANDRA-3610).
2) Please reapply v3, which has the mark() (this seems to be used by range slice, and the Stress tool does it).
3) Please set the CRC chance to 0.0 by updating the chance. We need to do this before the SSTables are created, otherwise it won't take effect (the update statements I used are in the attached *.doc). You might not see any difference if it is not set, because that's a big bottleneck.
4) I used the Sun JDK for the test.

The test results are attached; let me know in case of any questions. The performance seems to be better.
I used the stress test so we are on the same page, and when the column size or the range of columns to be fetched increases, the performance gets better (rebuffers).

      was (Author: vijay2...@yahoo.com):
    Alright, I think I found the missing piece:

1) Please reapply v2 from CASSANDRA-3611 (which also depends on CASSANDRA-3610).
2) Please reapply v3, which has the mark() (this seems to be used by range slice, and the Stress tool does it).
3) Please set the CRC chance to 0.0 by updating the chance. We need to do this before the SSTables are created, otherwise it won't take effect (the update statements I used are in the attached *.doc). You might not see any difference if it is not set, because that's a big bottleneck.
4) I used the Sun JDK for the test.

The test results are attached; let me know in case of any questions. The performance seems to be better.
I used the stress test so we are on the same page, and when the column size or the range of columns to be fetched increases, the performance gets better (rebuffers).

> use MMapedBuffer in CompressedSegmentedFile.getSegment
> ------------------------------------------------------
>
>                 Key: CASSANDRA-3623
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3623
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.1
>            Reporter: Vijay
>            Assignee: Vijay
>              Labels: compression
>             Fix For: 1.1
>
>         Attachments: 0001-MMaped-Compression-segmented-file-v2.patch,
> 0001-MMaped-Compression-segmented-file-v3.patch,
> 0001-MMaped-Compression-segmented-file.patch,
> 0002-tests-for-MMaped-Compression-segmented-file-v2.patch,
> 0002-tests-for-MMaped-Compression-segmented-file-v3.patch, CRC+MMapIO.xlsx,
> MMappedIO-Performance.docx
>
>
> CompressedSegmentedFile.getSegment seems to open a new file and doesn't seem to
> use the MMap, hence higher CPU usage on the nodes and higher latencies on
> reads.
> This ticket is to implement the TODO mentioned in CompressedRandomAccessReader
> // TODO refactor this to separate concept of "buffer to avoid lots of read()
> syscalls" and "compression buffer"
> but I think a separate class for the Buffer will be better.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
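The issue description quoted above contrasts opening a new file for each getSegment call with reusing a memory mapping. As a rough illustration of that idea only, the following Java sketch maps a file once and hands out views of the existing mapping. The class name MappedSegmentSketch, its getSegment signature, and the demo helper are hypothetical and are not taken from the attached patches or from Cassandra's CompressedSegmentedFile; decompression is left out entirely.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration; not the class from the attached patches.
class MappedSegmentSketch {
    // The whole file is mapped once; a single mapping is limited to 2 GB.
    private final MappedByteBuffer mapped;

    MappedSegmentSketch(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            mapped = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    // Returns a view of [offset, offset + length) without opening the file again.
    ByteBuffer getSegment(int offset, int length) {
        ByteBuffer view = mapped.duplicate(); // independent position and limit
        view.limit(offset + length);
        view.position(offset);
        return view.slice();
    }

    // Writes ten ASCII digits to a temporary file and reads four of them back.
    static String demo() {
        try {
            Path tmp = Files.createTempFile("segment-sketch", ".bin");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, "0123456789".getBytes(StandardCharsets.US_ASCII));
            ByteBuffer segment = new MappedSegmentSketch(tmp).getSegment(3, 4);
            byte[] bytes = new byte[segment.remaining()];
            segment.get(bytes);
            return new String(bytes, StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 3456
    }
}
```

Each call to getSegment only adjusts the position and limit of a duplicate buffer, so no file is opened and no read() syscall is issued per segment.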
[jira] [Issue Comment Edited] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment
    [ https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176071#comment-13176071 ]

Vijay edited comment on CASSANDRA-3623 at 12/27/11 3:20 AM:
-----------------------------------------------------------

Alright, I think I found the missing piece:

1) Please reapply v2 from CASSANDRA-3611 (which also depends on CASSANDRA-3610).
2) Please reapply v3, which has the mark() (this seems to be used by range slice, and the Stress tool does it).
3) Please set the CRC chance to 0.0 by updating the chance. We need to do this before the SSTables are created, otherwise it won't take effect (the update statements I used are in the attached *.doc).
4) I used the Sun JDK for the test.

The test results are attached; let me know in case of any questions. The performance seems to be better.
I used the stress test so we are on the same page, and when the column size or the range of columns to be fetched increases, the performance gets better (rebuffers).

      was (Author: vijay2...@yahoo.com):
    Alright, I think I found the missing piece:

1) Please reapply v2 from CASSANDRA-3611 (which also depends on CASSANDRA-3610).
2) Please reapply v3, which has the mark() (this seems to be used by range slice, and the Stress tool does it).

The test results are attached; let me know in case of any questions. The performance seems to be better.
I used the stress test so we are on the same page, and when the column size or the range of columns to be fetched increases, the performance gets better (rebuffers).
[jira] [Issue Comment Edited] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment
    [ https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176071#comment-13176071 ]

Vijay edited comment on CASSANDRA-3623 at 12/27/11 3:22 AM:
-----------------------------------------------------------

Alright, I think I found the missing piece:

1) Please reapply v2 from CASSANDRA-3611 (which also depends on CASSANDRA-3610).
2) Please reapply v3, which has the mark() (this seems to be used by range slice, and the Stress tool does it).
3) Please set the CRC chance to 0.0 by updating the chance. We need to do this before the SSTables are created, otherwise it won't take effect (the update statements I used are in the attached *.doc). You might not see any difference if it is not set, because that's a big bottleneck.
4) I used the Sun JDK for the test.

The test results are attached; let me know in case of any questions. The performance seems to be better.
I used the stress test so we are on the same page, and when the column size or the range of columns to be fetched increases, the performance gets better (rebuffers).

      was (Author: vijay2...@yahoo.com):
    Alright, I think I found the missing piece:

1) Please reapply v2 from CASSANDRA-3611 (which also depends on CASSANDRA-3610).
2) Please reapply v3, which has the mark() (this seems to be used by range slice, and the Stress tool does it).
3) Please set the CRC chance to 0.0 by updating the chance. We need to do this before the SSTables are created, otherwise it won't take effect (the update statements I used are in the attached *.doc).
4) I used the Sun JDK for the test.

The test results are attached; let me know in case of any questions. The performance seems to be better.
I used the stress test so we are on the same page, and when the column size or the range of columns to be fetched increases, the performance gets better (rebuffers).
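Step 3 of the comment above asks for the CRC chance to be set to 0.0. The sketch below shows one way such a setting could behave, assuming the chance is the probability that a chunk's checksum is verified on each read. That reading, the class name CrcCheckChanceSketch, and both method names are assumptions made for illustration; none of this is code from Cassandra or from the attached patches.

```java
import java.util.Random;
import java.util.zip.CRC32;

// Hypothetical illustration of a probabilistic checksum check.
class CrcCheckChanceSketch {
    // Computes the CRC32 of a whole chunk.
    static long checksum(byte[] chunk) {
        CRC32 crc = new CRC32();
        crc.update(chunk, 0, chunk.length);
        return crc.getValue();
    }

    // Returns false only when the checksum was actually verified and did not match.
    // Random.nextDouble() lies in [0.0, 1.0), so a chance of 0.0 never verifies
    // and a chance of 1.0 always verifies.
    static boolean passes(byte[] chunk, long storedChecksum, double chance, Random rng) {
        if (chance > rng.nextDouble()) {
            return checksum(chunk) == storedChecksum;
        }
        return true; // verification skipped, no CRC work done
    }

    public static void main(String[] args) {
        byte[] chunk = {1, 2, 3, 4};
        long stored = checksum(chunk);
        byte[] corrupted = {1, 2, 3, 5};
        Random rng = new Random();
        System.out.println(passes(corrupted, stored, 0.0, rng)); // prints true (check skipped)
        System.out.println(passes(corrupted, stored, 1.0, rng)); // prints false (mismatch detected)
    }
}
```

With the chance at 0.0 the CRC32 computation is skipped on every read, which trades corruption detection for CPU time.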
[jira] [Issue Comment Edited] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment
    [ https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175595#comment-13175595 ]

Vijay edited comment on CASSANDRA-3623 at 12/23/11 10:30 PM:
------------------------------------------------------------

Hot Methods before the patch (trunk, without any patch):

Excl. User CPU    Name
    sec.       %
 1480.474  100.00
  756.717   51.11  crc32
  387.767   26.19  @0x54999 ()
   54.814    3.70  org.apache.cassandra.io.compress.CompressedRandomAccessReader.(java.lang.String, org.apache.cassandra.io.compress.CompressionMetadata, boolean)
   46.676    3.15  org.apache.cassandra.io.util.RandomAccessReader.(java.io.File, int, boolean)
   45.697    3.09  Copy::pd_disjoint_words(HeapWord*, HeapWord*, unsigned long)
   39.417    2.66  memcpy
   36.931    2.49  @0xd8e9 ()
   23.272    1.57  CompactibleFreeListSpace::block_size(const HeapWord*) const
   22.766    1.54  SpinPause
   12.593    0.85  BlockOffsetArrayNonContigSpace::block_start_unsafe(const void*) const
    9.304    0.63  CardTableModRefBSForCTRS::card_will_be_scanned(signed char)
    8.468    0.57  CardTableModRefBS::non_clean_card_iterate_work(MemRegion, MemRegionClosure*, bool)
    8.051    0.54  ParallelTaskTerminator::offer_termination(TerminatorTerminator*)
    5.400    0.36  madvise
    4.619    0.31  CardTableModRefBS::process_chunk_boundaries(Space*, DirtyCardToOopClosure*, MemRegion, MemRegion, signed char**, unsigned long, unsigned long)
    1.584    0.11  CardTableModRefBS::dirty_card_range_after_reset(MemRegion, bool, int)
    1.551    0.10  SweepClosure::do_blk_careful(HeapWord*)

Hot Methods After the patch:

    sec.       %
  537.681  100.00
  529.719   98.52  @0x54999 ()
    4.168    0.78  memcpy
    0.143    0.03
    0.121    0.02  send
    0.121    0.02  sun.misc.Unsafe.park(boolean, long)
    0.110    0.02  sun.misc.Unsafe.unpark(java.lang.Object)
    0.088    0.02  Interpreter
    0.077    0.01  org.apache.cassandra.utils.EstimatedHistogram.max()
    0.077    0.01  recv
    0.066    0.01  SpinPause
    0.055    0.01  org.apache.cassandra.utils.EstimatedHistogram.mean()
    0.044    0.01  java.lang.Object.wait(long)
    0.044    0.01  org.apache.cassandra.utils.EstimatedHistogram.min()
    0.044    0.01  __pthread_cond_signal
    0.044    0.01  vtable stub
    0.033    0.01  java.lang.Object.notify()
    0.033    0.01  java.util.concurrent.ThreadPoolExecutor$Worker.runTask(java.lang.Runnable)
    0.033    0.01  org.apache.cassandra.io.compress.CompressedMappedFileDataInput.read()
    0.033    0.01  PhaseLive::compute(unsigned)
    0.033    0.01  poll
    0.022    0.00  Arena::contains(const void*) const
    0.022    0.00  CompactibleFreeListSpace::free() const
    0.022    0.00  I2C/C2I adapters
    0.022    0.00  IndexSetIterator::advance_and_next()
    0.022    0.00  java.lang.Class.forName0(java.lang.String, boolean, java.lang.ClassLoader)
    0.022    0.00  java.lang.Long.getChars(long, int, char[])
    0.022    0.00  java.nio.Bits.swap(int)

Before this patch response times (With crc chance set to 0):

Epoch       Rds/s     RdLat  Wrts/s  WrtLat  %user  %sys  %idle  %iowait  %steal  md0 r/s   w/s  rMB/s  wMB/s  NetRxKb  NetTxKb    Read 99th    Read 95th  Write 99th  Write 95th  Compacts
1324587443     15   186.305       0   0.000  27.85  0.02  71.83     0.24    0.05     3.89  0.00   0.12   0.00       41       45   545.791 ms   454.826 ms     0.00 ms     0.00 ms  Pen/0
1324587455     15  1142.712       0   0.000  39.55  0.13  57.61     2.50    0.21   118.30  0.30   2.20   0.00       34       36  8409.007 ms  8409.007 ms     0.00 ms     0.00 ms  Pen/0
1324587467     10   171.808       0   0.000  23.83  0.04  76.05     0.04    0.05     4.80  0.00   0.14   0.00      127       33   454.826 ms   315.852 ms     0.00 ms     0.00 ms  Pen/0
1324587478     10   182.775       0   0.000  20.43  0.04  79.47     0.01    0.05     1.60  0.40   0.04   0.00       30       37   379.022 ms   379.022 ms     0.00 ms     0.00 ms  Pen/0
1324587490     13   190.893       0   0.000  27.58  0.03  72.20     0.14    0.06     3.20  0.50   0.09   0.00       39       42   545.791 ms   379.022 ms     0.00 ms     0.00 ms  Pen/0
1324587503     28   358.719       0   0.000  52.24  0.08  46.20     1.40    0.09   159.40  0.00   3.16   0.00      196       71  3379.391 ms   943.127 ms     0.00 ms     0.00 ms  Pen/0
1324587517     13   194.281       0   0.000  16.68  0.02  83.23     0.04    0.02     2.40  0.30   0.07   0.00       38       41   785.939 ms   545.791 ms     0.00 ms     0.00 ms  Pen/0
1324587535     36   662.410       0   0.000  58.34  0.08  41.42     0.06    0.10     3.60  0.20   0.11   0.00      173       81  3379.391 ms  2816.159 m