[ 
https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17738258#comment-17738258
 ] 

Amit Pawar commented on CASSANDRA-18464:
----------------------------------------

 

[~aweisberg] thanks for sharing the initial testcase; it helped a lot. 
The following results were obtained by testing MMap and DirectIO based 
segments (the latter with both the JNA based and the Native API based 
implementations).

 

+--------------------------+-----+----------+-----------+-------------+-------------------+--------------------------------+
| Segment                  | Run | Testcase | Combined  | Improvement | Test log          | High runtime?                  |
|                          |     | runtime  | size (GB) |             | file name         |                                |
+--------------------------+-----+----------+-----------+-------------+-------------------+--------------------------------+
| 1. MemoryMappedSegment   | r1  | 3m 56s   | ~74.16    | -           | MMap_r1.log       | Baseline                       |
|                          | r2  | 3m 51s   | ~69.13    | -           | MMap_r2.log       |                                |
|                          | r3  | 4m 7s    | ~76.16    | -           | MMap_r3.log       |                                |
+--------------------------+-----+----------+-----------+-------------+-------------------+--------------------------------+
| 2. DirectIOSegment       | r1  | 6m 41s   | ~178.94   | ~2.34x      | DIO_JNA_r1.log    | Due to replay of Commitlog     |
|    using JNA [1]         | r2  | 7m 5s    | ~190.44   | ~2.50x      | DIO_JNA_r2.log    | files whose total size is      |
|                          | r3  | 6m 56s   | ~188.09   | ~2.46x      | DIO_JNA_r3.log    | more than 180GB                |
+--------------------------+-----+----------+-----------+-------------+-------------------+--------------------------------+
| 3. DirectIOSegment using | r1  | 6m 56s   | ~184.94   | ~2.43x      | DIO_Native_r1.log | Due to replay of all Commitlog |
|    Native APIs           | r2  | 6m 45s   | ~177.09   | ~2.33x      | DIO_Native_r2.log | files and its size is more     |
|                          | r3  | 6m 59s   | ~186.03   | ~2.44x      | DIO_Native_r3.log | than 180GB                     |
+--------------------------+-----+----------+-----------+-------------+-------------------+--------------------------------+
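
The comment does not say how the Improvement column was derived. One reading that fits the reported figures, assumed here rather than confirmed, is the ratio of each Direct I/O run's combined size to the largest MMap run (~76.16 GB). A minimal Python sketch of that arithmetic, with the function name `improvement` chosen only for illustration:

```python
# Combined CommitLog size (GB) per run, copied from the results table.
mmap_gb = [74.16, 69.13, 76.16]
dio_jna_gb = [178.94, 190.44, 188.09]
dio_native_gb = [184.94, 177.09, 186.03]

# Assumption: the baseline is the largest MMap run; the comment does not say.
baseline_gb = max(mmap_gb)


def improvement(size_gb, baseline=baseline_gb):
    """Ratio of a run's combined size to the assumed MMap baseline."""
    return size_gb / baseline


jna_ratios = [improvement(s) for s in dio_jna_gb]
native_ratios = [improvement(s) for s in dio_native_gb]

for label, ratios in (("JNA", jna_ratios), ("Native", native_ratios)):
    print(label, " ".join(f"{r:.2f}x" for r in ratios))
```

Under this assumption every computed ratio lands within about 0.01 of the value in the table, which is in line with the "~" prefix on the reported numbers.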

Files used are:
 # The Native API based implementation could not be tested as-is because the 
CommitLog file size was zero when created. Applying 
[^SetCommitLogFileSize.patch] fixed this.
 # PeriodicCommitLogStressTest was configured to run for 100 seconds to create 
commitlog files. After the test it reports the total disk space used by all 
the CommitLog files. I used this metric to show the I/O speed. Please refer to 
[^PeriodicCommitLogStressTest.tar.bz2] for the test logs.
 # [^CommitLogStressTest.patch] contains the further changes needed to run the 
test. A new function is defined to get the status of the Direct I/O feature.

 

Note:

[1] The file open flags O_DSYNC and O_SYNC control how much file metadata is 
flushed along with the data on each write. O_DSYNC is already used, and I did 
not see much difference with or without O_SYNC.
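
For reference, the two flags in note [1] can be compared with a minimal Python sketch. This is not the patch's JNA code, and the helper names `open_for_synced_writes` and `write_record` are hypothetical. With O_DSYNC a write returns once the data, plus the metadata needed to read it back, is on stable storage; O_SYNC additionally waits for the remaining file metadata. O_DIRECT is left out here because it also requires block-aligned buffers.

```python
import os


def open_for_synced_writes(path, sync_all_metadata=False):
    """Open path for writing with O_DSYNC, or O_SYNC if sync_all_metadata is set."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    # O_DSYNC: data integrity only; O_SYNC: data plus all file metadata.
    flags |= os.O_SYNC if sync_all_metadata else os.O_DSYNC
    return os.open(path, flags, 0o644)


def write_record(path, payload, sync_all_metadata=False):
    """Write payload with synchronous semantics and return the byte count."""
    fd = open_for_synced_writes(path, sync_all_metadata)
    try:
        return os.write(fd, payload)
    finally:
        os.close(fd)
```

Timing `write_record` with and without `sync_all_metadata=True` on the CommitLog disk would be one way to repeat the comparison described in the note.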

Observation: Linux dstat metrics show
 # MemoryMappedSegment
 ## Data flush rate is 700-1200 MB/s. The system is not really loaded during 
this test.
 ## In a real scenario it is in the range of 300-500 MB/s.
 # DirectIOSegment
 ## JNA based implementation
 ### Data flush rate is 1800-2000 MB/s.
 ### In real testing it is in the range of 800-1100 MB/s.
 ## Native API based implementation
 ### Data flush rate is 1800-2000 MB/s, similar to the JNA based one.
 ### In real testing it is in the range of 700-900 MB/s.
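
As a rough cross-check of the dstat figures, the combined CommitLog size of each run can be divided by the 100-second stress test duration. The sketch below assumes the whole 100 seconds is spent writing and that 1 GB = 1000 MB; neither is stated in the comment, and `implied_rate_mb_s` is a name chosen for illustration.

```python
TEST_DURATION_S = 100  # stress test duration stated in the comment
MB_PER_GB = 1000       # assumption: decimal units; the comment does not say

# Combined CommitLog size (GB) per run, copied from the results table.
combined_size_gb = {
    "MMap": [74.16, 69.13, 76.16],
    "DIO_JNA": [178.94, 190.44, 188.09],
    "DIO_Native": [184.94, 177.09, 186.03],
}


def implied_rate_mb_s(size_gb, duration_s=TEST_DURATION_S):
    """Average write rate implied by the total size written in duration_s."""
    return size_gb * MB_PER_GB / duration_s


implied = {
    name: [round(implied_rate_mb_s(s)) for s in sizes]
    for name, sizes in combined_size_gb.items()
}

for name, rates in implied.items():
    print(f"{name}: {rates} MB/s")
```

Under these assumptions the MMap runs work out to roughly 690-760 MB/s and the Direct I/O runs to roughly 1770-1900 MB/s, which sits close to the lower end of the flush rates observed with dstat.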

Please let me know your feedback on this. Thanks.

> Enable Direct I/O For CommitLog Files
> -------------------------------------
>
>                 Key: CASSANDRA-18464
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18464
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Local/Commit Log
>            Reporter: Josh McKenzie
>            Assignee: Amit Pawar
>            Priority: Normal
>             Fix For: 5.x
>
>         Attachments: CommitLogStressTest.patch, 
> EnableDirectIOForCommitLogUsingNativeAPI.patch, 
> PeriodicCommitLogStressTest.tar.bz2, SetCommitLogFileSize.patch, 
> UseDirectIOFeatureForCommitLogFiles.patch
>
>
> Relocating from [dev@ email 
> thread.|https://lists.apache.org/thread/j6ny17q2rhkp7jxvwxm69dd6v1dozjrg]
>  
> I shared my investigation into the Commitlog I/O issue on large core count 
> systems in my previous email dated July-22; the link to the thread is given 
> below.
> [https://lists.apache.org/thread/xc5ocog2qz2v2gnj4xlw5hbthfqytx2n]
> Basically, two solutions looked possible to improve the CommitLog I/O.
>  # Multi-threaded syncing
>  # Using Direct-IO through JNA
> I worked on the 2nd option, considering the following benefits compared to 
> the first one:
>  # Direct I/O read/write throughput is very high compared to non-Direct I/O, 
> as learnt through FIO benchmarking.
>  # It reduces kernel file cache usage, which in turn reduces kernel I/O 
> activity, for Commitlog files only.
>  # Overall CPU usage is reduced for flush activity. JVisualVM shows CPU usage 
> < 30% for the Commitlog syncer thread with the Direct I/O feature.
>  # The Direct I/O implementation is easier than the multi-threaded one.
> As per the community suggestion, lower code complexity is preferable. Direct 
> I/O enablement looked promising, but there was one issue. 
> Java 8 does not have native support for Direct I/O, so using the JNA library 
> is a must. The same implementation should also work across other versions of 
> Java (like 11 and beyond).
> I have completed the Direct I/O implementation, and a summary of the changes 
> in the attached patch is given below.
>  # This implementation does not use Java file channels; the file is opened 
> through JNA to use the Direct I/O feature.
>  # New segments are defined: “DirectIOSegment” for Direct I/O and 
> “NonDirectIOSegment” for non-direct I/O (NonDirectIOSegment is for test 
> purposes only).
>  # JNA write call is used to flush the changes.
>  # New helper functions are defined in NativeLibrary.java and platform 
> specific file. Currently tested on Linux only.
>  # Patch allows user to configure optimum block size  and alignment if 
> default values are not OK for CommitLog disk.
>  # Following configuration options are provided in Cassandra.yaml file
> a. use_jna_for_commitlog_io : to use jna feature
> b. use_direct_io_for_commitlog : to use Direct I/O feature.
> c. direct_io_minimum_block_alignment: 512 (default)
> d. nvme_disk_block_size: 32MiB (default and can be changed as per the 
> required size)
>  The test matrix is complex, so the CommitLog related testcases and the 
> TPCx-IOT benchmark were tested. It works with both Java 8 and 11. Compressed 
> and encrypted segments are not supported yet; they can be enabled later 
> based on community feedback.
>  The following improvements are seen with Direct I/O enabled.
>  # 32 cores >= ~15%
>  # 64 cores >= ~80%
>  One more observation I would like to share: reading Commitlog files with 
> Direct I/O might help reduce node bring-up time after a node crash.
>  Tested with commit ID: 91f6a9aca8d3c22a03e68aa901a0b154d960ab07
>  The attached patch enables Direct I/O feature for Commitlog files. Please 
> check and share your feedback.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
