[ https://issues.apache.org/jira/browse/CASSANDRA-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970232#action_12970232 ]
Peter Schuller commented on CASSANDRA-1470:
-------------------------------------------

@jake: Pretty good idea to combine the two like this. It especially works if the newly written pages can get intelligently pulled in (or rather "not dropped").

A few things:

(1) In order for DONTNEED to be effective you have to fsync() (well, fdatasync() on Linux) first, since dirty pages will not be dropped. This will have similar performance implications to direct I/O (see my long post earlier in this ticket too), but at least it removes the need to carefully ensure writes happen in chunks; instead, fsync() frequency will have to be considered and traded off.

(2) Remember that DONTNEED affects the data globally for the system, meaning that a compaction that reads and then does DONTNEED will actively evict data from sstables that are in active use. (Again, see my longer post earlier in this issue.) So you'd have to use mincore() when reading too, in order to avoid evicting actively used data. (Note: Not doing so may be *worse* than current behavior, in addition to not causing an improvement, so I think this is important.)

But provided those are eventually addressed, mincore+fadvise seems like a pretty good combination.

One issue I can think of is that while mincore() gives you information in bulk for many pages, posix_fadvise() does not allow the equivalent. So we'd expect potentially quite a large number of posix_fadvise() calls, assuming in-core data is scattered across a large file. That might be significant in some cases (e.g. if half of the pages are in core, you may end up approaching one posix_fadvise() call per page read).
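To make the call-count concern concrete, here is a rough sketch (in Python for brevity, not taken from any of the attached patches). It assumes a per-page residency vector has already been captured with mincore() before the compaction read; Python has no standard mincore() wrapper, so that step is left out. The names `dontneed_ranges`, `drop_pages` and `was_resident`, and the fixed 4096-byte page size, are hypothetical and only for illustration.

```python
import os

PAGE_SIZE = 4096  # assumed; real code would query os.sysconf("SC_PAGE_SIZE")


def dontneed_ranges(was_resident, page_size=PAGE_SIZE):
    """Coalesce pages that were NOT in core before the read into
    (offset, length) byte ranges, one range per contiguous run."""
    ranges = []
    run_start = None
    for page, resident in enumerate(was_resident):
        if not resident:
            if run_start is None:
                run_start = page  # a run of non-resident pages begins
        elif run_start is not None:
            # A previously resident page ends the run; leave that page alone.
            ranges.append((run_start * page_size,
                           (page - run_start) * page_size))
            run_start = None
    if run_start is not None:
        ranges.append((run_start * page_size,
                       (len(was_resident) - run_start) * page_size))
    return ranges


def drop_pages(fd, was_resident, page_size=PAGE_SIZE):
    """Flush, then advise DONTNEED only for pages the read itself pulled in.
    Returns the number of posix_fadvise() calls issued."""
    ranges = dontneed_ranges(was_resident, page_size)
    os.fdatasync(fd)  # dirty pages are not dropped by DONTNEED
    for offset, length in ranges:
        os.posix_fadvise(fd, offset, length, os.POSIX_FADV_DONTNEED)
    return len(ranges)
```

With a residency vector that alternates between in-core and not-in-core pages, `dontneed_ranges` returns one range per non-resident page, which is the worst case described above; long runs of cold pages collapse into a single call each.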
> use direct io for compaction
> ----------------------------
>
>         Key: CASSANDRA-1470
>         URL: https://issues.apache.org/jira/browse/CASSANDRA-1470
>     Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>    Reporter: Jonathan Ellis
>    Assignee: Pavel Yaskevich
>     Fix For: 0.7.1
>
> Attachments: 1470-v2.txt, 1470.txt, CASSANDRA-1470-for-0.6.patch,
> CASSANDRA-1470-v10-for-0.7.patch, CASSANDRA-1470-v11-for-0.7.patch,
> CASSANDRA-1470-v12-0.7.patch, CASSANDRA-1470-v2.patch,
> CASSANDRA-1470-v3-0.7-with-LastErrorException-support.patch,
> CASSANDRA-1470-v4-for-0.7.patch, CASSANDRA-1470-v5-for-0.7.patch,
> CASSANDRA-1470-v6-for-0.7.patch, CASSANDRA-1470-v7-for-0.7.patch,
> CASSANDRA-1470-v8-for-0.7.patch, CASSANDRA-1470-v9-for-0.7.patch,
> CASSANDRA-1470.patch,
> use.DirectIORandomAccessFile.for.commitlog.against.1022235.patch
>
>
> When compaction scans through a group of sstables, it forces the data in the
> os buffer cache being used for hot reads, which can have a dramatic negative
> effect on performance.