Although I've never indexed anything quite that large, I've had good experiences splitting the index across a cluster. For example, a set that takes about 4 seconds per complicated query on one of our machines takes around a second when spread over 6. I think this helps because, as the others have mentioned, performance is bound by disk I/O speed, and each additional disk array adds to the effective disk bandwidth.
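
To make the idea concrete, here is a minimal sketch of the scatter-gather pattern, assuming each shard can answer a query on its own and return its local top hits. Everything in it is hypothetical and for illustration only: the in-memory dict shards, the names `search_shard` and `search_cluster`, and the term-count scoring. It is not Lucene's API, and the threads only stand in for separate machines that each read from their own disks.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def search_shard(shard, term, k):
    """Return the top k (score, doc_id) hits from a single shard.

    `shard` is a hypothetical in-memory stand-in for one index partition:
    a dict mapping doc_id to a list of tokens. The score is simply the
    number of times the term occurs in the document.
    """
    hits = [
        (tokens.count(term), doc_id)
        for doc_id, tokens in shard.items()
        if term in tokens
    ]
    return heapq.nlargest(k, hits)


def search_cluster(shards, term, k=3):
    """Send the query to every shard in parallel, then merge the results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        # Scatter: each shard computes its own local top k.
        partials = list(pool.map(lambda s: search_shard(s, term, k), shards))
    # Gather: the global top k must be among the per-shard top k lists.
    return heapq.nlargest(k, (hit for part in partials for hit in part))


if __name__ == "__main__":
    shards = [
        {"a1": ["lucene", "index", "lucene"], "a2": ["disk"]},
        {"b1": ["lucene"], "b2": ["lucene", "lucene", "lucene"]},
    ]
    print(search_cluster(shards, "lucene", k=2))
```

Each shard only has to read its own slice of the index, so the per-query disk work is divided among the machines, and the merge step touches at most k hits per shard.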
good luck
- andy g

On Fri, 07 May 2004 04:47:55 +0500, Will Allen <[EMAIL PROTECTED]> wrote:
>
> Hi,
> I am considering a project that would index 315+ million documents. I am
> comfortable that the indexing will work well in creating an index ~800GB in size,
> but am concerned about the query performance. (Is this a bad
> assumption?)
>
> What are the bottlenecks of performance as an index scales? Memory? Cost is not
> a concern, so what would be the shortcomings of a theoretical machine with 16GB of
> ram, 4-16 cpus and 1-2 terabytes of space? Would it be better to cluster machines
> to break apart the query?
>
> Thank you for your serious responses,
> Will Allen
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]