[ https://issues.apache.org/jira/browse/CASSANDRA-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13137046#comment-13137046 ]
Brandon Williams commented on CASSANDRA-3248:
---------------------------------------------

No, plain SATA drives.

> CommitLog writer should call fdatasync instead of fsync
> -------------------------------------------------------
>
>                 Key: CASSANDRA-3248
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3248
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.6.13, 0.7.9, 0.8.6, 1.0.0, 1.1
>         Environment: Linux
>            Reporter: Zhu Han
>            Assignee: Brandon Williams
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> CommitLogSegment uses SequentialWriter to flush the buffered data to the log
> device. It depends on FileDescriptor#sync(), which invokes fsync() and so
> forces the file attributes to disk as well.
> However, at least on Linux, fdatasync() is good enough for commit log flush:
> bq. fdatasync() is similar to fsync(), but does not flush modified metadata
> unless that metadata is needed in order to allow a subsequent data retrieval
> to be correctly handled. For example, changes to st_atime or st_mtime
> (respectively, time of last access and time of last modification; see
> stat(2)) do not require flushing because they are not necessary for a
> subsequent data read to be handled correctly. On the other hand, a change to
> the file size (st_size, as made by say ftruncate(2)), would require a
> metadata flush.
> The file size is synced to disk by fdatasync() as well. Although the commit
> log recovery logic sorts the commit log segments by their modification
> timestamp, that sort can be removed safely, IMHO.
> I checked the native code of JRE 6. On Linux and Solaris,
> FileChannel#force(false) invokes fdatasync(). On Windows, the false flag does
> not have any impact.
> On my log device (commodity SATA HDD, write cache disabled), there is a large
> performance gap between fsync() and fdatasync():
> {quote}
> $ sysbench --test=fileio --num-threads=1 --file-num=1 --file-total-size=10G
> --file-fsync-all=on --file-fsync-mode={color:red}fdatasync{color}
> --file-test-mode=seqwr --max-time=600 --file-block-size=2K --max-requests=0
> run
> {color:blue}54.90{color} Requests/sec executed
> per-request statistics:
>      min:                    8.29ms
>      avg:                   18.18ms
>      max:                  108.36ms
>      approx. 95 percentile: 25.02ms
> $ sysbench --test=fileio --num-threads=1 --file-num=1 --file-total-size=10G
> --file-fsync-all=on --file-fsync-mode={color:red}fsync{color}
> --file-test-mode=seqwr --max-time=600 --file-block-size=2K --max-requests=0
> run
> {color:blue}28.08{color} Requests/sec executed
> per-request statistics:
>      min:                   33.28ms
>      avg:                   35.61ms
>      max:                  911.87ms
>      approx. 95 percentile: 41.69ms
> {quote}
> I do think this is a very critical performance improvement.
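
For anyone who wants to try the difference from the Java side, here is a minimal sketch, not the Cassandra code. It assumes the behaviour described in the issue (FileChannel#force(false) asking for a data-only sync, force(true) also syncing metadata). The class name CommitLogSyncSketch and the method appendAndSync are made up for illustration; the 2K record size just mirrors the sysbench block size above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: append small records and sync after each one,
// either data-only (force(false)) or data plus metadata (force(true)).
public class CommitLogSyncSketch {

    /**
     * Appends `count` copies of `record` to `file`, forcing the channel after
     * every record. Returns the total number of bytes written.
     */
    public static long appendAndSync(Path file, byte[] record, int count,
                                     boolean syncMetadata) throws IOException {
        long written = 0;
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            for (int i = 0; i < count; i++) {
                ByteBuffer buf = ByteBuffer.wrap(record);
                // write() may write fewer bytes than requested, so loop.
                while (buf.hasRemaining()) {
                    written += ch.write(buf);
                }
                // false: data-only sync; true: data and file metadata.
                ch.force(syncMetadata);
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("commitlog-sketch", ".log");
        byte[] record = new byte[2048]; // same block size as the sysbench run
        for (boolean meta : new boolean[] {false, true}) {
            long start = System.nanoTime();
            appendAndSync(file, record, 100, meta);
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println("force(" + meta + "): " + ms
                    + " ms for 100 synced writes");
        }
        Files.delete(file);
    }
}
```

Timings will depend heavily on the device, the filesystem and whether the drive write cache is enabled, so this is only a rough way to check whether a gap like the one reported shows up on a given machine.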