Sorry, wrong email ;) Mike McCandless
http://blog.mikemccandless.com On Tue, Jun 14, 2011 at 8:05 AM, Michael McCandless <luc...@mikemccandless.com> wrote: > Hmm, this sounds hairy :) > > Are you sure NRTCachingDir won't work for you? > > Mike McCandless > > http://blog.mikemccandless.com > > On Tue, Jun 14, 2011 at 5:58 AM, Ganesh <emailg...@yahoo.co.in> wrote: >> Is it a bad idea to keep multiple shards in a single system? >> >> Regards >> Ganesh >> >> ----- Original Message ----- >> From: "Toke Eskildsen" <t...@statsbiblioteket.dk> >> To: <java-user@lucene.apache.org> >> Sent: Tuesday, June 14, 2011 12:58 PM >> Subject: Re: Index size and performance degradation >> >> >>> On Sun, 2011-06-12 at 10:10 +0200, Itamar Syn-Hershko wrote: >>>> The whole point of my question was to find out if and how to make >>>> balancing on the SAME machine. Apparently thats not going to help and at >>>> a certain point we will just have to prompt the user to buy more >>>> hardware... >>> >>> It really depends on your scenario. If you have few concurrent requests >>> and are looking to minimize latency, sharding might help; assuming you >>> have fast IO and multiple cores. You basically want to saturate all >>> available resources for all requests. >>> >>> On the other hand, if throughput is the issue, sharding on a single >>> machine is counter-productive due to increased duplication and merging. >>> >>>> Out of curiosity, isn't there anything that we can do to avoid that? for >>>> instance using memory-mapped files for the indexes? anything that would >>>> help us overcome OS limitations of that sort... >>> >>> One standard advice for speeding up searches is using SSD's. Our >>> (admittedly old) experiments puts SSD-performance near RAM. With the >>> prices we have now, SSD's seems like an obvious choice for most setups. >>> >>> We tried a few performance tests at different index sizes and for us, >>> index size vs. performance looked like the power law: Heavy performance >>> degradation in the beginning, less later. It makes sense when we look at >>> caching and it means that if you do not require stellar performance, you >>> can have very large indexes on few machines (cue Hathi Trust). >>> >>> - Toke Eskildsen >>> >>> >>> --------------------------------------------------------------------- >>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >>> For additional commands, e-mail: java-user-h...@lucene.apache.org >>> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org