[ 
https://issues.apache.org/jira/browse/CASSANDRA-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113466#comment-13113466
 ] 

Zhu Han commented on CASSANDRA-3248:
------------------------------------

It is very strange that there is no difference between fdatasync and fsync when 
overwriting a preallocated file. The result is probably highly correlated with 
the underlying file system implementation. 

IMHO, we should not change this until we have tested thoroughly on different 
file systems.

{quote}
$ sysbench --test=fileio --num-threads=1  --file-num=1 --file-total-size=512M 
--file-fsync-all=on --file-fsync-mode=fsync --file-test-mode=seqrewr 
--max-time=600 --file-block-size=2K  --max-requests=0 prepare

$ sysbench --test=fileio --num-threads=1  --file-num=1 --file-total-size=512M 
--file-fsync-all=on --file-fsync-mode={color:red}fsync{color} 
--file-test-mode=seqrewr --max-time=600 --file-block-size=2K  --max-requests=0 
run

Operations performed:  0 Read, 29384 Write, 29384 Other = 58768 Total
Read 0b  Written 57.391Mb  Total transferred 57.391Mb  (97.943Kb/sec)
   48.97 Requests/sec executed

    per-request statistics:
         min:                                 12.94ms
         avg:                                 20.42ms
         max:                                125.02ms
         approx.  95 percentile:              25.02ms

$ sysbench --test=fileio --num-threads=1  --file-num=1 --file-total-size=512M 
--file-fsync-all=on --file-fsync-mode={color:red}fdatasync{color} 
--file-test-mode=seqrewr --max-time=600 --file-block-size=2K  --max-requests=0 
run


Operations performed:  0 Read, 29307 Write, 29307 Other = 58614 Total
Read 0b  Written 57.24Mb  Total transferred 57.24Mb  (97.688Kb/sec)
   48.84 Requests/sec executed

    per-request statistics:
         min:                                 16.21ms
         avg:                                 20.47ms
         max:                                116.69ms
         approx.  95 percentile:              25.02ms


{quote}
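
To repeat the same comparison from inside the JVM instead of with sysbench, a minimal sketch could look like the one below. It is only an illustration under some assumptions: the class name SyncModeCheck and its methods are made up for this example, it uses the Java 7+ FileChannel.open API, and it relies on the issue description's statement that FileChannel#force(false) maps to fdatasync() on Linux. The block size matches the 2K used above, but the block count is kept small and should be raised for a real measurement.

```java
import static java.nio.file.StandardOpenOption.CREATE;
import static java.nio.file.StandardOpenOption.TRUNCATE_EXISTING;
import static java.nio.file.StandardOpenOption.WRITE;

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Cassandra.
public class SyncModeCheck {

    // Writes the whole file once so that the rewrite pass below overwrites
    // existing blocks and does not change the file size.
    static void prepare(Path file, int blockSize, int blocks) throws IOException {
        try (FileChannel ch = FileChannel.open(file, CREATE, WRITE, TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.allocate(blockSize);
            for (int i = 0; i < blocks; i++) {
                buf.clear();
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
            }
            ch.force(true); // data and metadata on disk before timing starts
        }
    }

    // Sequentially overwrites every block and forces after each write.
    // metaData=true asks for an fsync-like flush, metaData=false for a
    // data-only flush. Returns the elapsed time in nanoseconds.
    static long rewrite(Path file, int blockSize, int blocks, boolean metaData) throws IOException {
        try (FileChannel ch = FileChannel.open(file, WRITE)) {
            ByteBuffer buf = ByteBuffer.allocate(blockSize);
            long start = System.nanoTime();
            for (int i = 0; i < blocks; i++) {
                buf.clear();
                long pos = (long) i * blockSize;
                while (buf.hasRemaining()) {
                    pos += ch.write(buf, pos);
                }
                ch.force(metaData);
            }
            return System.nanoTime() - start;
        }
    }

    public static void main(String[] args) throws IOException {
        int blockSize = 2048; // same block size as the sysbench runs
        int blocks = 256;     // small on purpose; increase for real tests
        Path file = Files.createTempFile("syncmode", ".dat");
        try {
            prepare(file, blockSize, blocks);
            long withMeta = rewrite(file, blockSize, blocks, true);
            long dataOnly = rewrite(file, blockSize, blocks, false);
            System.out.printf("force(true):  %.2f ms/request%n", withMeta / 1e6 / blocks);
            System.out.printf("force(false): %.2f ms/request%n", dataOnly / 1e6 / blocks);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The temporary file is created in the default temp directory, so to test a particular file system the path would have to point at a directory on that device. Results depend heavily on the file system, mount options and disk write cache settings.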

> CommitLog writer should call fdatasync instead of fsync
> -------------------------------------------------------
>
>                 Key: CASSANDRA-3248
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3248
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.6.13, 0.7.9, 0.8.6, 1.0.0, 1.1
>         Environment: Linux
>            Reporter: Zhu Han
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> CommitLogSegment uses SequentialWriter to flush the buffered data to the log 
> device. It depends on FileDescriptor#sync(), which invokes fsync() and so 
> forces the file attributes to disk as well.
> However, at least on Linux, fdatasync() is good enough for commit log flush:
> bq. fdatasync() is similar to fsync(), but does not flush modified metadata 
> unless that metadata is needed in order to allow a subsequent data retrieval 
> to be  correctly handled.  For example, changes to st_atime or st_mtime 
> (respectively, time of last access and time of last modification; see 
> stat(2)) do not require flushing because they are not necessary for a 
> subsequent data read to be handled correctly.  On the other hand, a change to 
> the file size (st_size,  as  made  by  say  ftruncate(2)),  would require a 
> metadata flush.
> File size is synced to disk by fdatasync() as well. Although the commit log 
> recovery logic sorts the commit log segments by their modification timestamp, 
> that sort can be removed safely, IMHO.
> I checked the native code of JRE 6. On Linux and Solaris, 
> FileChannel#force(false) invokes fdatasync(). On Windows, the false flag does 
> not have any impact.
> On my log device (commodity SATA HDD, write cache disabled), there is a large 
> performance gap between fsync() and fdatasync():
> {quote}
> $ sysbench --test=fileio --num-threads=1  --file-num=1 --file-total-size=10G 
> --file-fsync-all=on --file-fsync-mode={color:red}fdatasync{color} 
> --file-test-mode=seqwr --max-time=600 --file-block-size=2K  --max-requests=0 
> run
> {color:blue}54.90{color} Requests/sec executed
>    per-request statistics:
>          min:                                  8.29ms
>          avg:                                 18.18ms
>          max:                                108.36ms
>          approx.  95 percentile:              25.02ms
> $ sysbench --test=fileio --num-threads=1  --file-num=1 --file-total-size=10G 
> --file-fsync-all=on --file-fsync-mode={color:red}fsync{color} 
> --file-test-mode=seqwr --max-time=600 --file-block-size=2K  --max-requests=0 
> run
> {color:blue}28.08{color} Requests/sec executed
>     per-request statistics:
>          min:                                 33.28ms
>          avg:                                 35.61ms
>          max:                                911.87ms
>          approx.  95 percentile:              41.69ms
> {quote}
> I do think this is a very critical performance improvement.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
