Too many open files [was Re: Minimizing the impact of compaction on latency and throughput]

2010-07-14 Thread Thomas Downing
On 7/13/2010 9:20 AM, Jonathan Ellis wrote: On Tue, Jul 13, 2010 at 4:19 AM, Thomas Downing tdown...@proteus-technologies.com wrote: On a related note: I am running some feasibility tests looking for high ingest rate capabilities. While testing Cassandra the problem I've encountered is

Re: Minimizing the impact of compaction on latency and throughput

2010-07-13 Thread Terje Marthinussen
(2) posix_fadvise() feels more obscure and less portable than O_DIRECT, the latter being well-understood and used by e.g. databases for a long time. Due to the need for doing data alignment in the application itself (you are bypassing all the OS magic here), there is really nothing portable

Re: Minimizing the impact of compaction on latency and throughput

2010-07-13 Thread Peter Schuller
Due to the need for doing data alignment in the application itself (you are bypassing all the OS magic here), there is really nothing portable about O_DIRECT. Just have a look at open(2) on linux: [snip] So, just within Linux you got different mechanisms for this depending on kernel and fs

Re: Minimizing the impact of compaction on latency and throughput

2010-07-13 Thread Jonathan Ellis
On Tue, Jul 13, 2010 at 4:19 AM, Thomas Downing tdown...@proteus-technologies.com wrote: On a related note:  I am running some feasibility tests looking for high ingest rate capabilities.  While testing Cassandra the problem I've encountered is that it runs out of file handles during
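
For anyone trying to reproduce the file-handle exhaustion described above, here is a minimal sketch for checking how close a process is to its descriptor limit. It is not taken from the thread: the helper name `fd_headroom` is hypothetical, and the `/proc/self/fd` count assumes Linux (elsewhere the count is reported as `None`).

```python
import os
import resource


def fd_headroom():
    """Return (soft_limit, hard_limit, fds_in_use) for the current process.

    fds_in_use is None when /proc/self/fd is unavailable (non-Linux systems).
    """
    # RLIMIT_NOFILE is the per-process cap on open file descriptors.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        # Assumption: Linux procfs, one directory entry per open descriptor.
        in_use = len(os.listdir("/proc/self/fd"))
    except OSError:
        in_use = None
    return soft, hard, in_use


if __name__ == "__main__":
    soft, hard, in_use = fd_headroom()
    print(f"soft={soft} hard={hard} in_use={in_use}")
```

Sampling this during an ingest test should show whether descriptor use climbs steadily toward the soft limit or only spikes around compaction.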

Re: Minimizing the impact of compaction on latency and throughput

2010-07-13 Thread Terje Marthinussen
On Tue, Jul 13, 2010 at 10:26 PM, Jonathan Ellis jbel...@gmail.com wrote: I'm totally fine with saying "Here's a JNI library for Linux [or even Linux version >= 2.6.X]" since that makes up 99% of our production deployments, and leaving the remaining 1% with the status quo. You really need to

Re: Minimizing the impact of compaction on latency and throughput

2010-07-12 Thread Jonathan Ellis
This looks relevant: http://chbits.blogspot.com/2010/06/lucene-and-fadvisemadvise.html (see comments for directions to code sample) On Fri, Jul 9, 2010 at 1:52 AM, Peter Schuller peter.schul...@infidyne.com wrote: It might be worth experimenting with posix_fadvise.  I don't think implementing

Re: Minimizing the impact of compaction on latency and throughput

2010-07-09 Thread Peter Schuller
It might be worth experimenting with posix_fadvise.  I don't think implementing our own i/o scheduler or rate-limiter would be as good a use of time (it sounds like you're on that page too). Ok. And yes I mostly agree, although I can imagine circumstances where a pretty simple rate limiter
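
As a rough illustration of the posix_fadvise experiment suggested above, the sketch below reads a file sequentially and tells the kernel that pages already consumed need not stay in the page cache. It is only a sketch under stated assumptions, not Cassandra code: the function name `read_with_cache_hints` and the 1 MiB chunk size are made up for the example, and the hints are skipped on platforms where Python does not expose `os.posix_fadvise`.

```python
import os


def read_with_cache_hints(path, chunk_size=1 << 20):
    """Read a file sequentially and return the number of bytes read.

    Where available, posix_fadvise is used to (a) announce sequential
    access and (b) mark already-read ranges as not needed in the cache.
    These are advisory hints; the kernel is free to ignore them.
    """
    has_fadvise = hasattr(os, "posix_fadvise")
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        if has_fadvise:
            # offset=0, len=0 means "the whole file".
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            if has_fadvise:
                # Drop the range just read from the page cache, if possible.
                os.posix_fadvise(fd, total, len(chunk), os.POSIX_FADV_DONTNEED)
            total += len(chunk)
    finally:
        os.close(fd)
    return total
```

Unlike O_DIRECT, this places no alignment requirements on the application's buffers, so it is easy to try out. Whether it actually reduces cache eviction during compaction would have to be measured.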

Re: Minimizing the impact of compaction on latency and throughput

2010-07-07 Thread Peter Schuller
This makes sense, but from what I have seen, read contention vs cassandra is a much bigger deal than write contention (unless you don't have a separate device for your commitlog, but optimizing for that case isn't one of our goals). I am not really concerned with write performance, but rather

Re: Minimizing the impact of compaction on latency and throughput

2010-07-07 Thread Jonathan Ellis
On Wed, Jul 7, 2010 at 4:57 PM, Peter Schuller peter.schul...@infidyne.com wrote: This makes sense, but from what I have seen, read contention vs cassandra is a much bigger deal than write contention (meant read contention vs compaction) I am not really concerned with write performance, but