Hi, interesting discussion. Suppose my index is now 1 TB. I split it across 16 hard disks (about 65 GB per disk) in the same machine, which has 16 cores. Is using ParallelMultiSearcher a good idea for this structure? Will searches be fast? Is there a better solution for this setup?
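Just to make sure I understand the API, here is a minimal sketch of what I have in mind, assuming Lucene 2.9. The shard paths (/mnt/disk0/index ... /mnt/disk15/index) and the "body" field are made up for illustration: one read-only IndexSearcher per disk, all wrapped in a ParallelMultiSearcher.

import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ParallelMultiSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Searchable;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class ShardedSearch {
  public static void main(String[] args) throws Exception {
    // one read-only searcher per physical disk (paths are placeholders)
    Searchable[] shards = new Searchable[16];
    for (int i = 0; i < shards.length; i++) {
      shards[i] = new IndexSearcher(
          FSDirectory.open(new File("/mnt/disk" + i + "/index")), true);
    }

    // ParallelMultiSearcher searches each shard in its own thread
    // and merges the top hits into a single TopDocs
    ParallelMultiSearcher searcher = new ParallelMultiSearcher(shards);

    QueryParser parser = new QueryParser(Version.LUCENE_29, "body",
        new StandardAnalyzer(Version.LUCENE_29));
    Query query = parser.parse("some query");
    TopDocs hits = searcher.search(query, 10);
    System.out.println("total hits: " + hits.totalHits);

    searcher.close();  // closes the sub-searchers as well
  }
}

As I understand it, ParallelMultiSearcher fires one thread per sub-searcher, so with 16 disks and 16 cores each shard would be searched on its own core. Is that the right way to think about it?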
thanks,

On Fri, Oct 23, 2009 at 9:33 AM, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:

> On Fri, 2009-10-23 at 08:49 +0200, Jake Mannix wrote:
> > One of the big problems you'll run into with this index size is that
> > you'll never have enough RAM to give your OS's IO cache enough room to keep
> > much of this index in memory, so you're going to be seeking in this monster
> > file a lot. [...]
>
> Solid State Drives help a lot in this respect. We've done experiments
> with a 40GB index and adjustments of the amount of RAM available for
> file cache. We observed that search speed using SSDs wasn't nearly as
> susceptible to cache size as with conventional hard disks.
>
> Some quick and fairly unstructured notes on our observations:
> http://wiki.statsbiblioteket.dk/summa/Hardware
>
> > [...]
> > This may be mitigated by using really fast disks, possibly, which is yet
> > another reason why you'll need to do some performance profiling on a
> > variety of sizes with similar-to-production data sets.
>
> For our setup, a switch from conventional hard disks to SSDs moved the
> bottleneck from I/O to CPU/RAM.

--
Felipe Lobo
www.jusbrasil.com.br