On Thu, Jul 15, 2010 at 11:41 AM, Peter Schuller <peter.schul...@infidyne.com> wrote:
> Not really. That is, the intent of mmap is to let the OS dynamically
> choose what gets swapped in and out. The practical problem is that the
> OS will often tend to swap too much. I got the impression jbellis
> wasn't convinced, but my anecdotal experience is that this is a much
> larger problem for mmap():ed data than for regular buffer cached data

I'm convinced. :) See comments on
https://issues.apache.org/jira/browse/CASSANDRA-1214

> Also, personally I'm interested in hearing what kind of performance
> impacts people have *actually* seen with standard I/O; especially if
> cassandra is configured to configure a significant amount of data in
> RAM itself. I'm a bit skeptical about claims of extreme performance
> differences, in spite of syscalls being expensive.

The main problem is not the syscall so much as Java insisting on
zeroing out any buffer you create, which is a big hit to performance
when you're allocating buffers for file i/o on each request instead of
just mmaping things. Re-using those buffers would be possible but
difficult; I think using mlockall to "fix" the mmap approach is more
promising.

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
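
For illustration only, here is a minimal Java sketch of the two read paths contrasted above. It is not Cassandra's code: the class name ReadPaths and its three methods are hypothetical, and it assumes a file small enough to map in one piece (under 2 GB). The standard path allocates a fresh array per request, which the JVM zero-fills before the read overwrites it; the mmap path maps the file once and serves each request as a view over the mapping, with no per-request allocation of the data itself.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class, for illustration only.
class ReadPaths {

    // Standard I/O: a new buffer for every request. The JVM zero-fills
    // the array first, then readFully() overwrites every byte of it.
    static byte[] readStandard(Path file, long offset, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            byte[] buf = new byte[length];
            raf.seek(offset);
            raf.readFully(buf);
            return buf;
        }
    }

    // mmap: map the whole file once. The mapping stays valid after the
    // channel is closed, so it can be kept and shared across requests.
    static MappedByteBuffer mapFile(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    // Serve a request as a view over the mapping: no copy of the data,
    // and no zeroed buffer. Pages are faulted in by the OS on access.
    static ByteBuffer readMapped(MappedByteBuffer map, int offset, int length) {
        ByteBuffer view = map.duplicate(); // independent position and limit
        view.position(offset);
        view.limit(offset + length);
        return view.slice();
    }
}
```

Note that the sketch does nothing about the swapping problem itself. The JDK has no built-in mlockall call, so pinning the mapped pages would need a native binding, which is left out here.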